-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEAT] Adds SQL function modules #2725
Conversation
The workflows fail due to missing
I am inclined to remove the single unit test to help get this in sooner, but I am not sure the best approach for the handling of aggregations. It may not actually be much effort to add them here just like you had done for max. Please let me know how you would prefer to proceed, thanks! |
CodSpeed Performance ReportMerging #2725 will not alter performanceComparing Summary
|
src/daft-sql/src/functions.rs
Outdated
/// ``` | ||
ToDatetime, | ||
/// [SQLFunction] extends [FunctionExpr] with basic input validation (arity-check). | ||
pub struct SQLFunction(FunctionExpr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the issue with using FunctionExpr
here is that it prevents us from using our newer exprs that have been moved over to daft-functions
. Eventually we plan to migrate all of the FunctionExpr
impls over to the newer ScalarUDF
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make this work with aggregates (max
), as well as ScalarUDF
's, we'll likely want to use a trait object instead
pub trait SQLFunction {/*...*/}
pub struct SQLFunctions {
map: HashMap<String, Arc<dyn SQLFunction>>,
}
In this revision, I've only added aggregations – I've left some notes for later PRs which might extend an existing trait upon completion of the FunctionExpr -> ScalarUDF migration. Please let me know if you wish to include ScalarUDF in this as well, or if it's better to get this PR in as-is with a follow-up PR. Thanks! |
impl SQLModule for SQLModuleAggs { | ||
fn register(parent: &mut SQLFunctions) { | ||
use AggExpr::*; | ||
// HACK TO USE AggExpr as an enum rather than a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than a ...?
let nil = Arc::new(Expr::Literal(LiteralValue::Null)); | ||
parent.add_agg("count", Count(nil.clone(), daft_core::CountMode::Valid)); | ||
parent.add_agg("sum", Sum(nil.clone())); | ||
parent.add_agg("avg", Mean(nil.clone())); | ||
parent.add_agg("mean", Mean(nil.clone())); | ||
parent.add_agg("min", Min(nil.clone())); | ||
parent.add_agg("max", Max(nil.clone())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let nil = Arc::new(Expr::Literal(LiteralValue::Null)); | |
parent.add_agg("count", Count(nil.clone(), daft_core::CountMode::Valid)); | |
parent.add_agg("sum", Sum(nil.clone())); | |
parent.add_agg("avg", Mean(nil.clone())); | |
parent.add_agg("mean", Mean(nil.clone())); | |
parent.add_agg("min", Min(nil.clone())); | |
parent.add_agg("max", Max(nil.clone())); | |
use daft_dsl::null_lit; | |
parent.add_agg("count", Count(null_lit(), daft_core::CountMode::Valid)); | |
parent.add_agg("sum", Sum(null_lit()); | |
parent.add_agg("avg", Mean(null_lit()); | |
parent.add_agg("mean", Mean(null_lit()); | |
parent.add_agg("min", Min(null_lit()); | |
parent.add_agg("max", Max(null_lit()); |
This commit defines SQLFunctions and SQLModule which are responsible for registering the Daft FunctionExpr to be accessible from SQL statements. It is basic and operates on-top-of FunctionExpr .. that is, without modifying it.
Overview
Future Considerations