[BUG] add sort after running passes (#1545)
* Verified that this fixes Parquet reading from S3 on `19.parquet`.
* The bug occurred when none of the passes changed the set of ranges, so the new range list never overwrote the current value and the ranges were left unsorted.
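The failure mode described above can be sketched in isolation. This is a minimal illustration, not the actual `ReadPlanner` code: `Planner`, `Pass`, `merge_adjacent`, and `run_passes` are hypothetical names, and it assumes that a pass sorts a copy of the ranges and reports a change only when it merges something.

```rust
use std::ops::Range;

// Hypothetical pass signature: returns (changed, new_ranges).
type Pass = fn(&[Range<usize>]) -> (bool, Vec<Range<usize>>);

// Hypothetical pass: sorts a copy of the ranges and merges overlapping ones.
// It reports `changed = true` only when at least one merge happened.
fn merge_adjacent(ranges: &[Range<usize>]) -> (bool, Vec<Range<usize>>) {
    let mut sorted = ranges.to_vec();
    sorted.sort_by_key(|r| r.start);
    let mut out: Vec<Range<usize>> = Vec::new();
    let mut changed = false;
    for r in sorted {
        if let Some(last) = out.last_mut() {
            if last.end >= r.start {
                last.end = last.end.max(r.end);
                changed = true;
                continue;
            }
        }
        out.push(r);
    }
    (changed, out)
}

struct Planner {
    ranges: Vec<Range<usize>>,
    passes: Vec<Pass>,
}

impl Planner {
    // The sorted output of a pass is stored only when the pass reports a change.
    fn run_passes(&mut self) {
        for pass in &self.passes {
            let (changed, ranges) = pass(&self.ranges);
            if changed {
                self.ranges = ranges;
            }
        }
    }

    // Mirrors the fix: sort unconditionally before consuming the ranges.
    fn collect(mut self) -> Vec<Range<usize>> {
        self.ranges.sort_by_key(|r| r.start);
        self.ranges
    }
}
```

With non-overlapping ranges inserted out of order, no pass reports a change, so `run_passes` leaves `ranges` in insertion order; only the sort in `collect` restores the invariant.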
samster25 authored Oct 30, 2023
1 parent ac107ae commit 1d21bf9
Showing 1 changed file with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions src/daft-parquet/src/read_planner.rs
@@ -144,16 +144,20 @@ impl ReadPlanner {
self.ranges = ranges;
}
}

Ok(())
}

pub fn collect(
- self,
+ mut self,
io_client: Arc<IOClient>,
io_stats: Option<IOStatsRef>,
) -> DaftResult<Arc<RangesContainer>> {
let mut entries = Vec::with_capacity(self.ranges.len());

+ // We have to sort again to maintain the invariant of the list being sorted after running passes
+ // We also have to do this before the loop so we spawn tokio tasks front to back of the file
+ self.ranges.sort_by_key(|v| v.start);

for range in self.ranges {
let owned_io_client = io_client.clone();
let owned_url = self.source.clone();
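
The second comment in the hunk, about spawning tasks front to back, can be illustrated with a small stand-alone sketch. It swaps tokio tasks for `std::thread` so it needs no external crates; `RangeRequest` and `fetch_in_file_order` are hypothetical names, and the "fetch" is simulated by returning each range's start and length.

```rust
use std::thread;

// Hypothetical request type; only `start` matters for ordering.
struct RangeRequest {
    start: usize,
    end: usize,
}

// Sorts the requests by start offset, then spawns one worker per request
// in that order. Taking the vector by value with `mut` plays the same role
// as changing `self` to `mut self` in the diff: the sort happens in place.
fn fetch_in_file_order(mut requests: Vec<RangeRequest>) -> Vec<(usize, usize)> {
    requests.sort_by_key(|r| r.start);

    // Collecting forces every spawn to happen, in sorted order, before any join.
    let handles: Vec<_> = requests
        .into_iter()
        .map(|r| thread::spawn(move || (r.start, r.end - r.start)))
        .collect();

    // Joining in the same order yields results from the front of the file to the back.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .collect()
}
```

Sorting before the loop, rather than after collecting results, means the earliest byte ranges are requested first, which is the ordering the commit comment asks for.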