Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support type arg to ibis.array and ibis.map #8289

Open
1 task done
NickCrews opened this issue Feb 8, 2024 · 6 comments
Open
1 task done

feat: Support type arg to ibis.array and ibis.map #8289

NickCrews opened this issue Feb 8, 2024 · 6 comments
Assignees
Labels
feature Features or general enhancements

Comments

@NickCrews
Copy link
Contributor

Is your feature request related to a problem?

This is already supported for ibis.struct.

I want to create an array of type array<uint8>. ibis.array() doesn't support a type param, so if the type can't be inferred from the values, I can't use it:

  • If I want a NULL value, I have to use ibis.literal(None, type="array<uint8>").
  • If I want length-0 value, I have to use ibis.literal([], type="array<uint8>")
  • For longer-length arrays, I am able use ibis.array for the initial construction, but then I have to do a .cast() afterwards to ensure the type is correct: ibis.array([1,2]).type() shows array<int8>, so I have to do ibis.array([1,2]).cast("array<uint8>")

I haven't ever used map types, but I imagine it is a similar story.

It is annoying to use two different constructors depending on the length. There should be one way.

I could just use ibis.literal, but it's docstring says that constructing these complex types using it is going to be deprecated in a future release. If that is the case, then my workarounds have a shelf life. U

Describe the solution you'd like

add a type param to these three functions. It should look just like it does for struct, I like that API.

Then, either:

  • actually follow through with what the docstring says for literal and start the deprecation and removal timeline
  • remove that note on future deprecation. This was added in this commit, but I can't find a reasoning behind it.

What version of ibis are you running?

main

What backend(s) are you using, if any?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@NickCrews NickCrews added the feature Features or general enhancements label Feb 8, 2024
@chloeh13q
Copy link
Contributor

/take

@chloeh13q chloeh13q moved this from backlog to todo in Ibis planning and roadmap Mar 7, 2024
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 15, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken,
we can fix them in a followup.

You can test this locally with eg
`pytest -m duckdb -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns

Also, support passing in None.

Also, error when the value type can't be inferred from empty python literals
(eg what is the value type for the elements of []?)

Also, make the type argument for struct() always have an effect, not just when passing in python literals.
So basically it can act like a cast.

Also, make these constructors idempotent.
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 15, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken,
we can fix them in a followup.

You can test this locally with eg
`pytest -m duckdb -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns

Also, support passing in None.

Also, error when the value type can't be inferred from empty python literals
(eg what is the value type for the elements of []?)

Also, make the type argument for struct() always have an effect, not just when passing in python literals.
So basically it can act like a cast.

Also, make these constructors idempotent.
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 18, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken,
we can fix them in a followup.

You can test this locally with eg
`pytest -m duckdb -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns

Also, fix executing NULL arrays on pandas.

Also, fix casting structs on pandas.
See ibis-project#8687

Also, support passing in None.

Also, error when the value type can't be inferred from empty python literals
(eg what is the value type for the elements of []?)

Also, make the type argument for struct() always have an effect, not just when passing in python literals.
So basically it can act like a cast.

Also, make these constructors idempotent.
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 19, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken,
we can fix them in a followup.

You can test this locally with eg
`pytest -m duckdb -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns

Also, fix executing NULL arrays on pandas.

Also, fix casting structs on pandas.
See ibis-project#8687

Also, support passing in None.

Also, error when the value type can't be inferred from empty python literals
(eg what is the value type for the elements of []?)

Also, make the type argument for struct() always have an effect, not just when passing in python literals.
So basically it can act like a cast.

Also, make these constructors idempotent.
@chloeh13q
Copy link
Contributor

@NickCrews IIUC, your PR #8666 should also resolve this. If so, I will re-assign this issue to you.

@NickCrews
Copy link
Contributor Author

That would work. Also, that PR is turning into a bit of a doozy, if you wanted to help out there it would be appreciated!

@chloeh13q
Copy link
Contributor

@NickCrews Sure, I can take a look!

@chloeh13q chloeh13q assigned NickCrews and unassigned chloeh13q Mar 20, 2024
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 22, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

The big structural change is that now the core Operations for Array and Structs have a different internal representation, so they can distringuish between
- the entire value is NULL
- the contained values are NULL

Before, ops.Array held onto a `VarTuple[Value]`. So the contained Values could be NULL, but there was no way to say the entire thing was null. Now, ops.Array stores a `None | VarTuple[Value]`. The same thing for ops.StructValue. ops.Map didn't suffer from this, because it stores a `ops.Array`s internally, so since `ops.Array` can distinguish between entirely-NULL and contains-NULL, so can ops.Map

A fallout of this is that ops.Array needs a way to explicitly store its dtype. Before, it derived its dtype based on the dtype of its args. But
now that `None` is a valid value, it is now possible for there to be no values to inspect! So the Op actually stores its dtype explicitly. If you pass in values, then supplying the dtype on construction is optional, we go back to the old behavior of deriving it from the inputs.

This requires the backend compilers to now deal with that case.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken,
we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns

Also, fix executing NULL arrays on pandas.

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, fix casting structs on pandas.
See ibis-project#8687

Also, support passing in None to all these constructors.

Also, error when the value type can't be inferred from empty python literals
(eg what is the value type for the elements of []?)

Also, make the type argument for struct() always have an effect, not just when passing in python literals.
So basically it can act like a cast.

Also, make these constructors idempotent.
NickCrews added a commit to NickCrews/ibis that referenced this issue Mar 22, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

The big structural change is that now the core Operations for Array and Structs have a different internal representation, so they can distringuish between
- the entire value is NULL
- the contained values are NULL

Before, ops.Array held onto a `VarTuple[Value]`. So the contained Values could be NULL, but there was no way to say the entire thing was null. Now, ops.Array stores a `None | VarTuple[Value]`. The same thing for ops.StructValue. ops.Map didn't suffer from this, because it stores a `ops.Array`s internally, so since `ops.Array` can distinguish between entirely-NULL and contains-NULL, so can ops.Map

A fallout of this is that ops.Array needs a way to explicitly store its dtype. Before, it derived its dtype based on the dtype of its args. But
now that `None` is a valid value, it is now possible for there to be no values to inspect! So the Op actually stores its dtype explicitly. If you pass in values, then supplying the dtype on construction is optional, we go back to the old behavior of deriving it from the inputs.

This requires the backend compilers to now deal with that case.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken,
we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns

Also, fix executing NULL arrays on pandas.

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, fix casting structs on pandas.
See ibis-project#8687

Also, support passing in None to all these constructors.

Also, error when the value type can't be inferred from empty python literals
(eg what is the value type for the elements of []?)

Also, make the type argument for struct() always have an effect, not just when passing in python literals.
So basically it can act like a cast.

Also, make these constructors idempotent.
NickCrews added a commit to NickCrews/ibis that referenced this issue May 1, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

The big structural change is that now the core Operations for Array and Structs have a different internal representation, so they can distringuish between
- the entire value is NULL
- the contained values are NULL

Before, ops.Array held onto a `VarTuple[Value]`. So the contained Values could be NULL, but there was no way to say the entire thing was null. Now, ops.Array stores a `None | VarTuple[Value]`. The same thing for ops.StructValue. ops.Map didn't suffer from this, because it stores a `ops.Array`s internally, so since `ops.Array` can distinguish between entirely-NULL and contains-NULL, so can ops.Map

A fallout of this is that ops.Array needs a way to explicitly store its dtype. Before, it derived its dtype based on the dtype of its args. But
now that `None` is a valid value, it is now possible for there to be no values to inspect! So the Op actually stores its dtype explicitly. If you pass in values, then supplying the dtype on construction is optional, we go back to the old behavior of deriving it from the inputs.

This requires the backend compilers to now deal with that case.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken,
we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns

Also, fix executing NULL arrays on pandas.

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, fix casting structs on pandas.
See ibis-project#8687

Also, support passing in None to all these constructors.

Also, error when the value type can't be inferred from empty python literals
(eg what is the value type for the elements of []?)

Also, make the type argument for struct() always have an effect, not just when passing in python literals.
So basically it can act like a cast.

Also, make these constructors idempotent.
NickCrews added a commit to NickCrews/ibis that referenced this issue May 9, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue May 11, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue May 11, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue May 11, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue May 12, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue May 12, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue May 14, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue May 14, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue May 14, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue May 15, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue May 15, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 4, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix a typing bug: map() can accept ArrayValues, not just ArrayColumns.

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.

Also, implement ops.StructColumn on pandas and dask
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 4, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 4, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 4, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 4, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 4, 2024
working towards ibis-project#8289

I'm not sure how useful empty structs are, since it seems like
not many backends actually support them.
But still, if you stay in ibis-land, perhaps it is useful.
Not that hard for us to support it, so why not.

I'm not sure of the history of the specific disallowment
that I am removing from the type inference.

Relevant context:

- ibis-project#8876
- https://github.com/ibis-project/ibis/issues?q=empty+struct
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 4, 2024
working towards ibis-project#8289

I'm not sure how useful empty structs are, since it seems like
only bigquery, dask, and pandas actually support them.
But still, if you stay in ibis-land, perhaps it is useful.
ie for doing type manipulations, or maybe you
only use them for intermediate calculations?
Not that hard for us to support it, so why not.

I'm not sure of the history of the specific disallowment
that I am removing from the type inference.

Relevant context:

- ibis-project#8876
- https://github.com/ibis-project/ibis/issues?q=empty+struct
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 4, 2024
working towards ibis-project#8289

I'm not sure how useful empty structs are, since it seems like
only bigquery, dask, and pandas actually support them.
But still, if you stay in ibis-land, perhaps it is useful.
ie for doing type manipulations, or maybe you
only use them for intermediate calculations?
Not that hard for us to support it, so why not.

I'm not sure of the history of the specific disallowment
that I am removing from the type inference.

Relevant context:

- ibis-project#8876
- https://github.com/ibis-project/ibis/issues?q=empty+struct
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 7, 2024
fixes ibis-project#8289

This does a lot of changes. It was hard for me to separate them out as I implemented them. But now that it's all hashed out, I can try to split this up into separate commits if you want. But that might be sorta hard in
some cases.

One this is adding support for passing in None to all these constructors.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Make these constructors idempotent: you can
pass in existing Expressions into array(), etc.
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast.

A big structural change is that now ops.Array has an optional
attribute "dtype", so if you pass in a 0-length sequence
of values the op still knows what dtype it is.

Several of the backends were always broken here, they just weren't getting caught. I marked them as broken, we can fix them in a followup.

You can test this locally with eg
`pytest -m <backend> -k factory ibis/backends/tests/test_array.py  ibis/backends/tests/test_map.py ibis/backends/tests/test_struct.py`

Also, fix executing Literal(None) on pandas and polars, 0-length arrays on polars

Also, fixing converting dtypes on clickhouse, Structs should be converted to nonnullable dtypes.
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 27, 2024
works towards ibis-project#8289

One this is adding support for passing in None.
These use the new `ibis.null(<type>)` API to return `op.Literal(None, <type>)`s

Also, now map() is idempotent: you can
pass in existing Expressions into map().
The type argument for all of these now always has an effect, not just when passing in python literals. So basically it acts like a cast if the argument is already an expression.

You can test this locally with eg
`pytest -m <backend> -k factory   ibis/backends/tests/test_map.py`
NickCrews added a commit to NickCrews/ibis that referenced this issue Jun 27, 2024
Part of ibis-project#8289

Depends on the array API getting updated
@ncclementi
Copy link
Contributor

@NickCrews I noticed there are multiple PRs open, draft or closed that were aiming to tackle this issue, but it's unclear what's the status. Would it be possible to leave a comment here on what are the steps needed to get this one across the line?

@NickCrews
Copy link
Contributor Author

#9473 needs to be resolved for the array changes (which really is getting stalled by larger philosophical decisions that this PR is exposing), and then the linked map PR can be resumed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Features or general enhancements
Projects
Status: todo
Development

Successfully merging a pull request may close this issue.

3 participants