Initial Polars support #1576
def test_validator_double_register_udf_polars() -> None:
    global action_list
I think we can drop this test.
d = {g: grouped.get_group(g) for g in grouped.groups.keys()}
return d
d = {g: grouped.get_group(g) for g in grouped.groups.keys()}
return d
 if not isinstance(dtype_or_type, type):
     return False

-if issubclass(dtype_or_type, (bool, int, np.number, np.bool_)):
+if issubclass(dtype_or_type, (bool, int, np.number, np.bool_, pl.datatypes.IntegerType)):
-if issubclass(dtype_or_type, (bool, int, np.number, np.bool_, pl.datatypes.IntegerType)):
+if issubclass(dtype_or_type, (bool, int, np.number, np.bool_)):
SplitSeries with multiple values, including numpy arrays for numbers, and strings as a Polars Series.
"""

# TODO: add a PolarsView, or convert PandasView to work with both Polars & Pandas
We should extract a View base class and make ListView, NumpyView, PandasView, PolarsView <: View. PolarsView might be better than NumpyView for ints and floats. PreprocessedColumn would contain a list of View instances rather than members of each specific View subclass.
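A minimal sketch of the shape this refactor could take, not the whylogs implementation. The `View` interface shown here (length plus iteration) and the `PreprocessedColumnSketch` stand-in are assumptions made for illustration; `NumpyView`, `PandasView` and `PolarsView` would be further subclasses wrapping their respective containers.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Iterator, List


class View(ABC):
    """Hypothetical common interface over one homogeneous slice of a column."""

    @abstractmethod
    def __len__(self) -> int:
        ...

    @abstractmethod
    def __iter__(self) -> Iterator[Any]:
        ...


class ListView(View):
    """View backed by a plain Python list."""

    def __init__(self, values: List[Any]) -> None:
        self._values = list(values)

    def __len__(self) -> int:
        return len(self._values)

    def __iter__(self) -> Iterator[Any]:
        return iter(self._values)


@dataclass
class PreprocessedColumnSketch:
    """Simplified stand-in: holds a list of View instances
    instead of one member per concrete view type."""

    views: List[View] = field(default_factory=list)

    def __len__(self) -> int:
        # Total number of values across all views.
        return sum(len(v) for v in self.views)


column = PreprocessedColumnSketch(views=[ListView([1, 2, 3]), ListView(["a", "b"])])
total = len(column)
```

With this layout, code that consumes a column iterates over `views` and never needs to know which backing library produced each one.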
 except Exception as e:  # noqa
-    new_df[new_col] = pd.Series([None])
+    new_df[new_col] = df.apply_udf(lambda x: float("nan"))  # should be None, but can't infer type
It might be better to just leave out the failed UDF's output column.
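To illustrate the alternative, here is a rough sketch that skips the output column when a UDF raises instead of filling it with a placeholder. It uses plain dicts of lists rather than a Pandas or Polars frame, and `apply_udfs` and the sample UDFs are hypothetical names, not part of whylogs.

```python
from typing import Any, Callable, Dict, List

Columns = Dict[str, List[Any]]


def apply_udfs(columns: Columns, udfs: Dict[str, Callable[[Columns], List[Any]]]) -> Columns:
    """Return the input columns plus one column per UDF that succeeded."""
    result = dict(columns)
    for new_col, udf in udfs.items():
        try:
            result[new_col] = udf(columns)
        except Exception:
            # Leave the failed UDF's column out entirely, so no
            # placeholder value or dtype has to be invented for it.
            continue
    return result


data = {"a": [1, 2], "b": [0, 4]}
udfs = {
    "a_plus_b": lambda c: [x + y for x, y in zip(c["a"], c["b"])],
    "a_over_b": lambda c: [x / y for x, y in zip(c["a"], c["b"])],  # divides by zero
}
out = apply_udfs(data, udfs)
```

The trade-off is that downstream code must tolerate a missing column, but it avoids the type-inference problem noted in the diff comment.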
python/.pre-commit-config.yaml
Outdated
        exclude: python/whylogs/core/proto/|python/docs/|python/whylogs/viz/html/|java|python/whylogs/api/logger/experimental/logger
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v0.942
    hooks:
      - id: mypy
        language: system
        exclude: ^(python/tests/|python/examples/|python/examples/integration/|python/whylogs/core/proto/|python/docs/|python/whylogs/viz/html/|java|python/whylogs/api/logger/experimental/logger)
        files: ^(python/whylogs/)
I'm not positive this checks the correct files, but this was the only way I found to keep it from running lint checks on site-packages.
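One way to sanity-check which files the hook would select is to apply the two patterns by hand. The sketch below assumes pre-commit matches `files` and `exclude` against repository-relative paths with `re.search`; the candidate paths are made up for illustration. It only models pre-commit's file selection, not which modules mypy may follow through imports.

```python
import re

# Patterns copied from the hook configuration above.
FILES = re.compile(r"^(python/whylogs/)")
EXCLUDE = re.compile(
    r"^(python/tests/|python/examples/|python/examples/integration/"
    r"|python/whylogs/core/proto/|python/docs/|python/whylogs/viz/html/"
    r"|java|python/whylogs/api/logger/experimental/logger)"
)


def selected(path: str) -> bool:
    """True if the path matches `files` and does not match `exclude`."""
    return bool(FILES.search(path)) and not EXCLUDE.search(path)


# Hypothetical paths, not taken from the repository.
candidates = [
    "python/whylogs/core/schema.py",
    "python/whylogs/core/proto/v0/messages_pb2.py",
    "python/tests/core/test_schema.py",
    "python/.venv/lib/python3.9/site-packages/foo/bar.py",
]
checked = [p for p in candidates if selected(p)]
```

Under these assumptions only the first path is selected: the proto file is excluded, and the test and site-packages paths never match `files`.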
Description
Support for logging Polars data frames.
Closes #1230