Commit
[CHORE] Write tpch parquet files one at a time (#3396)
When you specify a `num_parts` parameter when generating tpch files, the generator first produces `num_parts` CSVs, then reads those CSVs and writes them to parquet using Daft. However, `write_parquet` does not preserve the number of input files: even if there are 16 input files, there might be only 1 output file. The fix here is to read and write 1 file at a time.

Co-authored-by: Colin Ho <[email protected]>
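
The read-and-write-one-file-at-a-time approach described above can be sketched roughly as follows. This is an illustration, not the code from this commit: the helper names `daft_csv_to_parquet` and `convert_parts` are hypothetical, and the sketch assumes Daft is installed and that each CSV can be read with `daft.read_csv` defaults (real tpch outputs would likely need delimiter, header, and column-name options).

```python
from typing import Callable, List


def daft_csv_to_parquet(csv_path: str, out_dir: str) -> None:
    """Convert a single CSV file to parquet with Daft (hypothetical helper)."""
    import daft  # imported lazily; assumes Daft is installed

    # Read exactly one CSV and write it out on its own, so each write
    # covers a single input part instead of all parts combined.
    daft.read_csv(csv_path).write_parquet(out_dir)


def convert_parts(
    csv_paths: List[str],
    out_dir: str,
    convert_one: Callable[[str, str], None] = daft_csv_to_parquet,
) -> int:
    """Convert CSV parts one at a time; return how many were converted."""
    converted = 0
    # Sort for a deterministic processing order across runs.
    for csv_path in sorted(csv_paths):
        convert_one(csv_path, out_dir)
        converted += 1
    return converted
```

Passing the converter as a parameter keeps the loop independent of Daft, so the per-file behavior can be checked with a stand-in function before running it on real data.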