[FEAT] [JSON Reader] Add native streaming + parallel JSON reader. (#1679)

This PR adds a streaming + parallel JSON reader, with full support for most fundamental dtypes (sans decimal and binary types), arbitrary nesting with JSON lists and objects, including nulls at all levels of the JSON object tree.

## TODOs

- [x] Add schema inference unit test for dtype coverage (i.e. reading the `dtypes.jsonl` file).
- [x] Add temporal type inference + parsing test coverage.
- [x] Benchmarking + performance audit: this reader follows the same general concurrency + parallelism model of the streaming CSV reader, which performs relatively well for cloud reads, but there's bound to be a lot of low-hanging fruit around unnecessary copies.
- [ ] (Follow-up?) Add thorough parsing and dtype inference unit tests on in-memory defined JSON strings.
- [ ] (Follow-up) Support for decimal and (large) binary types.
- [ ] (Follow-up) Add support for strict parsing, i.e. returning an error instead of falling back to a null value when parsing fails.
- [ ] (Follow-up) Misc. bugs in Arrow2 that should be fixed and upstreamed.
- [ ] (Follow-up) Deflate compression support.
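
To make the general idea concrete, here is a minimal Python sketch of chunked, parallel parsing of line-delimited JSON with a null fallback on parse failure. It is not the reader added in this PR: the names `chunked`, `parse_chunk`, and `read_jsonl` are hypothetical, it uses a thread pool from the Python standard library, and it falls back to `None` for a whole record rather than for an individual value. Note also that `ThreadPoolExecutor.map` submits all chunks up front, so this sketch does not bound memory the way a true streaming reader would.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Any, Iterable, Iterator, List


def chunked(lines: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def parse_chunk(chunk: List[str]) -> List[Any]:
    """Parse each line as JSON; a line that fails to parse becomes None
    (non-strict behaviour, simplified to the record level)."""
    records: List[Any] = []
    for line in chunk:
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            records.append(None)
    return records


def read_jsonl(lines: Iterable[str], chunk_size: int = 1024,
               max_workers: int = 4) -> Iterator[Any]:
    """Parse chunks concurrently and yield records in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in submission order, so row order is kept.
        for records in pool.map(parse_chunk, chunked(lines, chunk_size)):
            yield from records


if __name__ == "__main__":
    sample = ['{"a": 1}', "not json", '{"a": null, "b": [1, {"c": null}]}']
    for record in read_jsonl(sample, chunk_size=2):
        print(record)
```

A strict mode, as listed in the follow-ups, would correspond to re-raising the decode error in `parse_chunk` instead of appending `None`.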

1 parent a53cd51, commit 3693c22
Showing 56 changed files with 3,657 additions and 268 deletions.