Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEAT]: expr simplifier #3393

Merged
merged 15 commits into from
Dec 5, 2024

Conversation

universalmind303
Copy link
Contributor

@universalmind303 universalmind303 commented Nov 21, 2024

still todo:

  • add a bunch of test cases.
    - [ ] add simplification for AND exprs

Note for reviewers:

This bypasses the expr simplifier rule if the source is SQLScanOperator it was simplifying the queries in a way that weren't working with connectorx. I haven't been able to debug exactly what was going on, but I think it's pretty ok to just skip this optimization for this source for now.

Also, wanted to implement a const expr evaluator, but since our eval_expressions is associated with daft_table::Table, this kind of optimization is not currently possible, and some refactoring is needed.

@github-actions github-actions bot added the enhancement New feature or request label Nov 21, 2024
Copy link

codspeed-hq bot commented Nov 21, 2024

CodSpeed Performance Report

Merging #3393 will improve performances by 53.05%

Comparing universalmind303:expr-simplifier (5480075) with main (c2abed8)

Summary

⚡ 2 improvements
✅ 15 untouched benchmarks

Benchmarks breakdown

Benchmark main universalmind303:expr-simplifier Change
test_iter_rows_first_row[100 Small Files] 228.4 ms 204.7 ms +11.6%
test_show[100 Small Files] 23.5 ms 15.3 ms +53.05%

@universalmind303 universalmind303 marked this pull request as ready for review November 25, 2024 18:12
@universalmind303
Copy link
Contributor Author

I'm going to split out "add simplification for AND exprs" into a separate PR.

@kevinzwang kevinzwang self-requested a review November 26, 2024 23:31
Copy link

codecov bot commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 86.53367% with 54 lines in your changes missing coverage. Please review.

Project coverage is 77.49%. Comparing base (c2abed8) to head (5480075).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...lan/src/optimization/rules/simplify_expressions.rs 83.71% 36 Missing ⚠️
src/daft-logical-plan/src/treenode.rs 92.14% 11 Missing ⚠️
src/common/scan-info/src/test/mod.rs 0.00% 3 Missing ⚠️
src/daft-scan/src/anonymous.rs 0.00% 3 Missing ⚠️
daft/sql/sql_scan.py 50.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3393      +/-   ##
==========================================
+ Coverage   77.42%   77.49%   +0.07%     
==========================================
  Files         702      703       +1     
  Lines       85210    85641     +431     
==========================================
+ Hits        65973    66371     +398     
- Misses      19237    19270      +33     
Files with missing lines Coverage Δ
daft/delta_lake/delta_lake_scan.py 77.77% <100.00%> (+0.35%) ⬆️
daft/hudi/hudi_scan.py 87.95% <100.00%> (+0.29%) ⬆️
daft/iceberg/iceberg_scan.py 90.62% <100.00%> (+0.14%) ⬆️
daft/io/_generator.py 94.87% <100.00%> (+0.27%) ⬆️
daft/io/_lance.py 93.10% <100.00%> (+0.24%) ⬆️
src/common/scan-info/src/scan_operator.rs 50.00% <ø> (ø)
...rc/daft-logical-plan/src/optimization/optimizer.rs 98.82% <100.00%> (+0.03%) ⬆️
src/daft-scan/src/glob.rs 89.91% <100.00%> (+0.09%) ⬆️
src/daft-scan/src/python.rs 70.21% <100.00%> (+0.64%) ⬆️
daft/sql/sql_scan.py 25.89% <50.00%> (+0.35%) ⬆️
... and 4 more

... and 8 files with indirect coverage changes

Copy link
Member

@kevinzwang kevinzwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just have a small comment

Comment on lines 97 to 102
// we want to simplify expressions first to make the rest of the rules easier
rule_batches.push(RuleBatch::new(
vec![Box::new(SimplifyExpressionsRule::new())],
RuleExecutionStrategy::FixedPoint(Some(3)),
));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put this under the rewrite rules since in my mind, the plan is in an invalid state before the rewrite rules are applied, so the rewrite rules should not rely on any optimizer rules but the optimizer rules can make assumptions about the validity of the plan.

@universalmind303 universalmind303 enabled auto-merge (squash) December 5, 2024 20:13
@universalmind303 universalmind303 merged commit e1c3faf into Eventual-Inc:main Dec 5, 2024
40 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants