-
Notifications
You must be signed in to change notification settings - Fork 599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support empty arrays, improve ibis.array() API #9473
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -5,16 +5,17 @@ | |||||
|
||||||
from public import public | ||||||
|
||||||
import ibis | ||||||
import ibis.expr.datatypes as dt | ||||||
import ibis.expr.operations as ops | ||||||
import ibis.expr.types as ir | ||||||
from ibis.common.annotations import ValidationError | ||||||
from ibis.common.deferred import Deferred, deferrable | ||||||
from ibis.expr.types.generic import Column, Scalar, Value | ||||||
|
||||||
if TYPE_CHECKING: | ||||||
from collections.abc import Callable, Iterable | ||||||
|
||||||
import ibis.expr.types as ir | ||||||
from ibis.expr.types.typing import V | ||||||
|
||||||
import ibis.common.exceptions as com | ||||||
|
||||||
|
||||||
|
@@ -1067,7 +1068,11 @@ | |||||
|
||||||
@public | ||||||
@deferrable | ||||||
def array(values: Iterable[V]) -> ArrayValue: | ||||||
def array( | ||||||
values: ArrayValue | Iterable | ir.NullValue | None, | ||||||
*, | ||||||
type: str | dt.DataType | None = None, | ||||||
) -> ArrayValue: | ||||||
"""Create an array expression. | ||||||
|
||||||
If any values are [column expressions](../concepts/datatypes.qmd) the | ||||||
|
@@ -1078,6 +1083,9 @@ | |||||
---------- | ||||||
values | ||||||
An iterable of Ibis expressions or Python literals | ||||||
type | ||||||
An instance of `ibis.expr.datatypes.DataType` or a string indicating | ||||||
the Ibis type of `value`. eg `array<float>`. | ||||||
|
||||||
Returns | ||||||
------- | ||||||
|
@@ -1108,15 +1116,49 @@ | |||||
│ [3, 42, ... +1] │ | ||||||
└──────────────────────┘ | ||||||
|
||||||
>>> ibis.array([t.a, 42 + ibis.literal(5)]) | ||||||
>>> ibis.array([t.a, 42 + ibis.literal(5)], type="array<float>") | ||||||
┏━━━━━━━━━━━━━━━━━━━━━━┓ | ||||||
┃ Array() ┃ | ||||||
┡━━━━━━━━━━━━━━━━━━━━━━┩ | ||||||
│ array<int64> │ | ||||||
│ array<float64> │ | ||||||
├──────────────────────┤ | ||||||
│ [1, 47] │ | ||||||
│ [2, 47] │ | ||||||
│ [3, 47] │ | ||||||
│ [1.0, 47.0] │ | ||||||
│ [2.0, 47.0] │ | ||||||
│ [3.0, 47.0] │ | ||||||
└──────────────────────┘ | ||||||
""" | ||||||
return ops.Array(tuple(values)).to_expr() | ||||||
type = dt.dtype(type) if type is not None else None | ||||||
if type is not None and not isinstance(type, dt.Array): | ||||||
raise ValidationError(f"type must be an array, got {type}") | ||||||
|
||||||
if isinstance(values, ir.Value): | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think we're going to accept inputs like this. Can you get rid of this code path? |
||||||
if type is not None: | ||||||
return values.cast(type) | ||||||
elif isinstance(values, ArrayValue): | ||||||
return values | ||||||
else: | ||||||
raise ValidationError( | ||||||
f"If no type passed, values must be an array, got {values.type()}" | ||||||
) | ||||||
|
||||||
if values is None: | ||||||
if type is None: | ||||||
raise ValidationError("If values is None/NULL, type must be provided") | ||||||
return ir.null(type) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't like this. It's unnecessary to have to support another way to construct a null value. If someone wants a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have a thought here but I better do it at my computer There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See these tests in mismo: I need to construct an
There are three different ibis APIs for doing this:
I think it can be summarized as SOMEONE is gonna need to do this |
||||||
|
||||||
values = tuple(values) | ||||||
if len(values) == 0: | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please use standard Python idioms for this.
Suggested change
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oops sorry will do. |
||||||
if type is None: | ||||||
raise ValidationError("If values is empty, type must be provided") | ||||||
return ops.EmptyArray(type).to_expr() | ||||||
else: | ||||||
value_type = type.value_type if type is not None else None | ||||||
values = [_value(v, value_type) for v in values] | ||||||
return ops.Array(values).to_expr() | ||||||
|
||||||
|
||||||
def _value(x, type) -> ir.Value: | ||||||
if isinstance(x, (ir.Value, Deferred)): | ||||||
return x.cast(type) if type is not None else x | ||||||
else: | ||||||
return ibis.literal(x, type=type) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What is the purpose of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If you do There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm still not sure why this is required. Is there some API that fails to execute if you don't do these casts? What if the elements themselves are arrays? Doesn't this then need to be recursive in the casting/call to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Shoot, github escaped the Now does the concern make sense? Two ways of storing this:
I am trying to go for option 2.
Specificaly, if I remove the .literal() here, then we pass a plain python
you mean what if x were a python list of ibis values? ie someone did |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just let this fall through and bubble up as it would? This function is already getting stuffed with a bunch of branching, and this doesn't seem to add much value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guarding against the footgun of someone passing in "float" instead of "array" (I literally accidentally did this as I adjusted the tests this last round), which would silently give the wrong result for some inputs. I agree not needed, I can remove, just a nicety