Batch Handling Upgrades
Outline
- Materialized Field. Column Metadata. Vector structures. Repeated lists. List vector. Union vector.
- Row set level. Build batch from schema. Unit testing framework.
- Vector readers. Object categories. Vector indexes. Vector accessors. Array accessors. Generated code. Array wrappers for nullable, arrays.
- Row-set writers. Top-level writers. Structure. Writing to arrays. Events. Offset vector updates.
- Row set loader. Concept of overflow. Column states. Vector states. Overflow processing. Vector allocation. Vector cache and multi-reader model.
- Operator framework. Split of concerns. Protocol adapter. Schema change detection.
- Projection framework. Concepts. Project lists. Null columns. Implicit columns. Assembling the output batch. Column information in projection list. Recursive projection in maps. Schema smoothing and persistence.
- Mock reader. CSV reader. Easy format plugin. Concept of Parquet support.
- JSON concepts. JSON issues. Revised JSON parser. JSON semantics. Open issues. Possible opportunities.
- Future opportunities. Code generation. Plugin APIs. Reader retrofits. Fixed-size buffers.