Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kurtosis udaf #4

Merged
merged 9 commits into from
Sep 28, 2024
Merged

Conversation

jatin510
Copy link
Contributor

@jatin510 jatin510 commented Sep 25, 2024

Which issue does this PR close?

Closes #12250 apache/datafusion#12250

Rationale for this change

I followed the changes in PR: apache/datafusion#12273

What changes are included in this PR?

Added a udaf to calculate the kurtosis.

Are these changes tested?

Added the test in main.rs file took reference from mode udaf.

Are there any user-facing changes?

@jatin510 jatin510 force-pushed the feature/kurtosis_udaf branch from c9444da to 26f4e39 Compare September 25, 2024 20:37
@demetribu
Copy link
Collaborator

@jatin510 Thank you
Could you add information about the function to the README.

src/lib.rs Outdated
@@ -26,14 +26,15 @@ use datafusion::logical_expr::AggregateUDF;
#[macro_use]
pub mod macros;
pub mod common;
pub mod kurtosis;
pub mod mode;

pub mod expr_extra_fn {
pub use super::mode::mode;
Copy link
Collaborator

@demetribu demetribu Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls add
pub use super::kurtosis::kurtosis;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

README.md Outdated
```

## Done

- [x] `mode(expression) -> scalar` - Returns the most frequent (mode) value from a column of data.
- [x] `max_by(expression1, expression2) -> scalar` - Returns the value of `expression1` associated with the maximum value of `expression2`.
- [x] `min_by(expression1, expression2) -> scalar` - Returns the value of `expression1` associated with the minimum value of `expression2`.
- [x] `kurtosis(expression) -> scalar` - Returns the kurtosis, a measure of the `tailedness` of the distribution, for the given numeric values in `expression`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This documentation is not same as one in code, we should have consistent documentation everywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made all the mentioned changes @dharanad

Comment on lines +44 to +50
impl Debug for KurtosisFunction {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("KurtosisFunction")
.field("signature", &self.signature)
.finish()
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think #[derive(Debug)] macro should serve the purpose here.

Copy link
Collaborator

@demetribu demetribu Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that an explicit implementation of Debug will be more informative here.

README.md Outdated
@@ -77,10 +77,20 @@ SELECT min_by(x, y) FROM VALUES (1, 10), (2, 5), (3, 15), (4, 8) as tab(x, y);
-- | 2 |
-- +---------------------+

-- Get the kurtosis value of a column of numeric data.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we have enough example, just a description in "Done" section, will be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made all the mentioned changes @dmitrybugakov

@jatin510 jatin510 force-pushed the feature/kurtosis_udaf branch from b98c6bb to 33fbfd4 Compare September 27, 2024 19:14
Copy link
Contributor

@dharanad dharanad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @jatin510

@jatin510 jatin510 force-pushed the feature/kurtosis_udaf branch from 1cb20dc to a0cf8a8 Compare September 28, 2024 09:49
@jatin510 jatin510 requested a review from demetribu September 28, 2024 09:51
Copy link
Collaborator

@demetribu demetribu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm
@jatin510 thank you
@dharanad thanks for the help

@demetribu demetribu merged commit 9f30214 into datafusion-contrib:main Sep 28, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants