feat(swordfish): Optimize grouped aggregations (#3534)
Optimize swordfish grouped aggs for high cardinality groups

### Approach

There are 3 strategies for grouped aggs:

1. Partition each input morsel into `N` partitions, then do a partial agg (good for high cardinality).
2. Do a partial agg, then partition into `N` partitions (good for low cardinality). Can be optimized with #3556.
3. Partition only, no partial agg (only for map_groups, which has no partial agg).

### Notes on alternative approaches

- Distributing partitions across workers (i.e. making each worker responsible for accumulating only one partition) is much slower for low cardinality aggs (TPCH Q1 would have been 1.5x slower). This is because most of the work ends up on only a few workers, reducing parallelism.
- Simply partitioning the input and aggregating only at the end works well with higher cardinality, but low cardinality takes a hit (TPCH Q1 would have been 2.5x slower).
- The Probe Table approach was much slower, due to many calls to the multi-table dyn comparator. It was also much more complex to implement.

### Benchmarks

[MrPowers Benchmarks](https://github.com/MrPowers/mrpowers-benchmarks) results (seconds, lower is better).

| Query | this PR  | Pyrunner | Current swordfish |
|-------|----------|----------|-------------------|
| q1    | 0.285720 | 0.768858 | 0.356499          |
| q2    | 4.780064 | 6.122199 | 53.340565         |
| q3    | 2.201079 | 3.922857 | 16.935125         |
| q4    | 0.313106 | 0.545192 | 0.335541          |
| q5    | 1.618228 | 2.889354 | 10.665339         |
| q7    | 2.087872 | 3.856998 | 16.072660         |
| q10   | 6.306756 | 8.173738 | 53.800501         |

---------

Co-authored-by: EC2 Default User <[email protected]>
Co-authored-by: Colin Ho <[email protected]>
Co-authored-by: EC2 Default User <[email protected]>
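The first two strategies listed under Approach differ only in the order of partitioning and partial aggregation. The sketch below illustrates that ordering on a toy model and is not the code from this commit: a morsel is assumed to be a slice of `(String, i64)` rows, the aggregation is assumed to be a per-key sum, and every name in it (`PartialAgg`, `partition_of`, `partition_then_agg`, `agg_then_partition`, `finalize`) is hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical partial aggregation state: a running sum per group key.
type PartialAgg = HashMap<String, i64>;

/// Map a group key to one of `n` partitions by hashing it (assumes n > 0).
fn partition_of(key: &str, n: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % n as u64) as usize
}

/// Strategy 1: split the morsel's rows into `n` buckets first,
/// then run a partial agg inside each bucket.
fn partition_then_agg(morsel: &[(String, i64)], n: usize) -> Vec<PartialAgg> {
    let mut buckets: Vec<Vec<&(String, i64)>> = vec![Vec::new(); n];
    for row in morsel {
        buckets[partition_of(&row.0, n)].push(row);
    }
    buckets
        .into_iter()
        .map(|bucket| {
            let mut agg = PartialAgg::new();
            for (key, value) in bucket {
                *agg.entry(key.clone()).or_insert(0) += *value;
            }
            agg
        })
        .collect()
}

/// Strategy 2: run one partial agg over the whole morsel,
/// then split the (already reduced) groups into `n` partitions.
fn agg_then_partition(morsel: &[(String, i64)], n: usize) -> Vec<PartialAgg> {
    let mut agg = PartialAgg::new();
    for (key, value) in morsel {
        *agg.entry(key.clone()).or_insert(0) += *value;
    }
    let mut parts = vec![PartialAgg::new(); n];
    for (key, value) in agg {
        parts[partition_of(&key, n)].insert(key, value);
    }
    parts
}

/// Final step for either strategy: for each partition index, merge the
/// partial states produced for that partition by every morsel.
fn finalize(per_morsel: Vec<Vec<PartialAgg>>, n: usize) -> PartialAgg {
    let mut out = PartialAgg::new();
    for p in 0..n {
        for parts in &per_morsel {
            for (key, value) in &parts[p] {
                *out.entry(key.clone()).or_insert(0) += *value;
            }
        }
    }
    out
}
```

In this toy model both orderings produce the same final sums. With few distinct keys, strategy 2 shrinks the morsel before anything is partitioned; with many distinct keys the up-front partial agg reduces little, which is consistent with the commit message recommending strategy 1 for high cardinality.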
1 parent 6c21917, commit e148248

Showing 13 changed files with 537 additions and 154 deletions.
56 changes: 0 additions & 56 deletions
src/daft-local-execution/src/intermediate_ops/aggregate.rs
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,6 @@ | ||
#![feature(let_chains)] | ||
#![feature(option_get_or_insert_default)] | ||
|
||
mod buffer; | ||
mod channel; | ||
mod dispatcher; | ||
|