Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect offsets for ListArray with invalid lists #2723

Closed
michaelvay opened this issue Aug 26, 2024 · 4 comments
Closed

Incorrect offsets for ListArray with invalid lists #2723

michaelvay opened this issue Aug 26, 2024 · 4 comments
Assignees

Comments

@michaelvay
Copy link
Contributor

Describe the bug
Getting a list from a ListArray that contains an invalid value beforehand returns wrong list value.
I think the issue is that the size of the invalid List is not expressed properly in the offsets resulting this issue.

To Reproduce
Steps to reproduce the behavior:

  1. Go to Daft/src/daft-core/src/array/ops/get.rs
  2. add this test and run
    fn test_list_get_some_valid() -> DaftResult<()> {
        let field = Field::new("foo", DataType::FixedSizeList(Box::new(DataType::Int32), 3));
        let flat_child = Int32Array::from(("foo", (0..9).collect::<Vec<i32>>()));
        let raw_validity = vec![true, false, true];
        let validity = Some(arrow2::bitmap::Bitmap::from(raw_validity.as_slice()));
        let arr = FixedSizeListArray::new(field, flat_child.into_series(), validity);
        let list_dtype = DataType::List(Box::new(DataType::Int32));
        let list_arr = arr.cast(&list_dtype)?;
        let l = list_arr.list()?;
        let element = l.get(0).unwrap();
        let element = element.i32()?;
        let data = element
            .into_iter()
            .map(|x| x.copied())
            .collect::<Vec<Option<i32>>>();
        let expected = vec![Some(0), Some(1), Some(2)];
        assert_eq!(data, expected);
        let element = l.get(2).unwrap();
        let element = element.i32()?;
        let data = element
            .into_iter()
            .map(|x| x.copied())
            .collect::<Vec<Option<i32>>>();
        let expected = vec![Some(6), Some(7), Some(8)];
        assert_eq!(data, expected);

        /// return [Some(3), Some(4), Some(5)] instead

Daft version: version = "0.3.0-dev0"
I aligned with main on this commit 20ffed4

@michaelvay michaelvay changed the title Inccorect offsets for ListArray with invalid lists Incorrect offsets for ListArray with invalid lists Aug 26, 2024
@universalmind303 universalmind303 self-assigned this Aug 26, 2024
@universalmind303
Copy link
Contributor

It looks like this isn't an issue with ListArray, but instead with the conversion between FixedSizeList and List.

        let field = Field::new("foo", DataType::FixedSizeList(Box::new(DataType::Int32), 3));
        let flat_child = Int32Array::from(("foo", (0..9).collect::<Vec<i32>>()));
        let raw_validity = vec![true, false, true];
        let validity = Some(arrow2::bitmap::Bitmap::from(raw_validity.as_slice()));
        let arr = FixedSizeListArray::new(field, flat_child.into_series(), validity);
        println!("{:?}", arr.to_arrow());
        // FixedSizeListArray[[0, 1, 2], None, [6, 7, 8]]
        let list_dtype = DataType::List(Box::new(DataType::Int32));
        let list_arr = arr.cast(&list_dtype)?;
        let l = list_arr.list()?;
        println!("{:?}", l.to_arrow());
        // LargeListArray[[0, 1, 2], None, [3, 4, 5]]

so it looks like the validity is not being applied correctly when downcasting to a list.

@michaelvay
Copy link
Contributor Author

Thanks for the quick reply. Correct, I misunderstood how offsets should be set for invalid values because I took inspiration from the problematic snippet in the casting logic. 🙏

@jaychia
Copy link
Contributor

jaychia commented Sep 1, 2024

@michaelvay I think #2729 should have fixed the issue -- I'll go ahead and close this but please do re-open the issue if it continues to be a problem after the next release!

@jaychia jaychia closed this as completed Sep 1, 2024
@michaelvay
Copy link
Contributor Author

@jaychia @universalmind303 Indeed solved thanks!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants
@jaychia @universalmind303 @michaelvay and others