Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] [Query Planner] Properly track ascending/descending sort order for range partitioning and sorting. #1862

Merged
merged 4 commits into from
Feb 14, 2024

Conversation

clarkzinzow
Copy link
Contributor

@clarkzinzow clarkzinzow commented Feb 9, 2024

This PR ensures that we properly track ascending/descending sort order for range partitioning and sorting ops, to ensure that downstream operations correctly interpret the partition spec.

For example, before this PR, the planner would always assume that if both sides of the join are sorted, then they're sorted in ascending order and could therefore be efficiently joined with our sort-merge join (which currently only does ascending-order sorts/merges); this would lead to incorrect results if either side was sorted in descending order.

Closes #1829

Follow-ups (future PRs)

  • Refactor sort-merge join to work for both ascending and descending order; this can be done when porting the Partitioning abstraction to Rust to enable better dtype support.

@github-actions github-actions bot added the bug Something isn't working label Feb 9, 2024
Copy link

codecov bot commented Feb 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (573ea12) 85.48% compared to head (5a69358) 85.55%.
Report is 9 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1862      +/-   ##
==========================================
+ Coverage   85.48%   85.55%   +0.06%     
==========================================
  Files          55       55              
  Lines        6147     6194      +47     
==========================================
+ Hits         5255     5299      +44     
- Misses        892      895       +3     

see 9 files with indirect coverage changes

@clarkzinzow clarkzinzow changed the title [BUG] [Query Planner] Properly tracking ascending/descending sort order for range partitioning and sorting. [BUG] [Query Planner] Properly track ascending/descending sort order for range partitioning and sorting. Feb 9, 2024
}
}

pub fn to_scheme(&self) -> PartitionScheme {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be redundancy between PartitionSchemeConfig and PartitionScheme. We can probably fold in num_partitions and by to the enum variants that are relevant

src/daft-plan/src/planner.rs Outdated Show resolved Hide resolved
src/daft-plan/src/planner.rs Outdated Show resolved Hide resolved
@clarkzinzow clarkzinzow merged commit e47eeda into main Feb 14, 2024
42 checks passed
@clarkzinzow clarkzinzow deleted the clark/partition-spec-config branch February 14, 2024 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Asc/desc hotfix for range partitioning
2 participants