-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEAT] Enable group by keys in aggregation expressions #3399
Conversation
CodSpeed Performance ReportMerging #3399 will degrade performances by 13.57%Comparing Summary
Benchmarks breakdown
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3399 +/- ##
==========================================
+ Coverage 76.54% 77.17% +0.63%
==========================================
Files 685 685
Lines 85269 83696 -1573
==========================================
- Hits 65266 64593 -673
+ Misses 20003 19103 -900
|
Graphite Automations"Request reviewers once CI passes" took an action on this PR • (11/22/24)1 reviewer was added to this PR based on Andrew Gazelka's automation. |
src/daft-dsl/src/resolve_expr/mod.rs
Outdated
@@ -208,7 +208,7 @@ fn expand_wildcards( | |||
} | |||
|
|||
/// Checks if an expression used in an aggregation is well formed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit
/// Checks if an expression used in an aggregation is well formed. | |
/// Checks if an expression used in an aggregation is well-formed. | |
adding new line makes look better in rustdocs
https://rust-lang.github.io/rust-clippy/master/index.html#too_long_first_doc_paragraph
src/daft-dsl/src/resolve_expr/mod.rs
Outdated
fn has_single_agg_layer(expr: &ExprRef, groupby: &HashSet<ExprRef>) -> bool { | ||
groupby.contains(expr) | ||
|| match expr.as_ref() { | ||
Expr::Agg(agg_expr) => !agg_expr.children().iter().any(has_agg), | ||
Expr::Column(_) => false, | ||
Expr::Literal(_) => true, | ||
_ => expr | ||
.children() | ||
.iter() | ||
.all(|e| has_single_agg_layer(e, groupby)), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmmm I'm not sure this function really tests has_single_agg_layer
anymore. Maybe rename or rethink logic? or make into two separate functions?
also potential bug (although unsure it could even occur):
groupby contains expr with multiple aggs -> returns true (as in single agg) even though not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point but I'm not really sure what to name it. I think the logic of the function itself is fairly straightforward though. I'll give it a thought
As for the bug, group bys should never have aggregations in them and will fail in another place in the code if they do.
src/daft-dsl/src/resolve_expr/mod.rs
Outdated
} | ||
/// - sum(col("a")) + col("b") when "b" is not a group by key | ||
/// - not all branches are aggregations, literals, or group by keys | ||
fn has_single_agg_layer(expr: &ExprRef, groupby: &HashSet<ExprRef>) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we move has_single_agg_layer
and validate_expr_in_agg
to be a member of ExprResolver
. It looks like they're only ever used as part of the expr resolving.
/// - sum(col("a")) + col("b") when "b" is not a group by key | ||
/// - not all branches are aggregations, literals, or group by keys | ||
fn is_valid_expr_in_agg(&self, expr: &ExprRef) -> bool { | ||
self.groupby.unwrap().contains(expr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd add a comment about how self.groupby.unwrap()
is safe here since it's checked previously on l:247
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops didn't realize you had some other comments. Will make a quick follow-up PR for these
No description provided.