Preserving record order with spark-excel #704

james-miles-ccy · 2023-01-11T09:58:47Z

Is there a way to preserve the order of records when reading an Excel file with spark-excel? As expected, rows get shuffled during the read. I would normally work around this by first creating an RDD and calling zipWithIndex(). Is there another way to work around this behaviour for Excel files without editing the source?

nightscape (Maintainer) · 2023-01-11T15:38:29Z

There were some discussions about adding a row number to each row, but as far as I recall it was never implemented.
