Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix is_categorical_dtype warning #1099

Merged
merged 17 commits into from
Aug 24, 2023
Merged

Fix is_categorical_dtype warning #1099

merged 17 commits into from
Aug 24, 2023

Conversation

flying-sheep
Copy link
Member

@flying-sheep flying-sheep commented Aug 18, 2023

No description provided.

@codecov
Copy link

codecov bot commented Aug 18, 2023

Codecov Report

Merging #1099 (56fe903) into main (b19a5ea) will decrease coverage by 2.12%.
Report is 4 commits behind head on main.
The diff coverage is 92.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1099      +/-   ##
==========================================
- Coverage   84.79%   82.67%   -2.12%     
==========================================
  Files          35       35              
  Lines        5090     5098       +8     
==========================================
- Hits         4316     4215     -101     
- Misses        774      883     +109     
Flag Coverage Δ
gpu-tests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
anndata/_core/aligned_mapping.py 93.66% <71.42%> (-0.76%) ⬇️
anndata/_core/anndata.py 82.79% <100.00%> (+0.02%) ⬆️
anndata/_core/merge.py 82.68% <100.00%> (-11.41%) ⬇️

... and 5 files with indirect coverage changes

@flying-sheep flying-sheep changed the title Fix index comparison Fix pandas 2.1 compatibility Aug 18, 2023
@flying-sheep flying-sheep marked this pull request as ready for review August 21, 2023 11:47
@flying-sheep flying-sheep added this to the 0.9.3 milestone Aug 21, 2023
@flying-sheep flying-sheep changed the title Fix pandas 2.1 compatibility Fix is_categorical_dtype warning Aug 21, 2023
@flying-sheep flying-sheep modified the milestones: 0.9.3, 0.10.0 Aug 21, 2023
anndata/_core/aligned_mapping.py Outdated Show resolved Hide resolved
anndata/_core/merge.py Outdated Show resolved Hide resolved
anndata/_core/merge.py Outdated Show resolved Hide resolved
@flying-sheep
Copy link
Member Author

@ivirshup all done

@ivirshup
Copy link
Member

About the extra change

Because it would have saved me from jumping into the debugger to see how the indices differ.

It looks like this will only be the case for small data, since afaict the error message from the assertion is just dumping the repr of each object into the exception message:

In [30]: adata.obsm["b"] = b.to_df()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[30], line 1
----> 1 adata.obsm["b"] = b.to_df()

File ~/github/anndata/anndata/_core/aligned_mapping.py:189, in AlignedActualMixin.__setitem__(self, key, value)
    188 def __setitem__(self, key: str, value: V):
--> 189     value = self._validate_value(value, key)
    190     self._data[key] = value

File ~/github/anndata/anndata/_core/aligned_mapping.py:254, in AxisArraysBase._validate_value(self, val, key)
    252 except AssertionError as e:
    253     msg = f"value.index does not match parent’s axis {self.axes[0]} names:\n{e}"
--> 254     raise ValueError(msg) from None
    255 else:
    256     msg = "Index.equals and pd.testing.assert_index_equal disagree"

ValueError: value.index does not match parents axis 0 names:
Index are different

Index values are different (0.03791 %)
[left]:  Index(['AAACATACAACCAC-1', 'AAACATTGAGCTAC-1', 'AAACATTGATCAGC-1',
       'AAACCGTGCTTCCG-1', 'AAACCGTGTATGCG-1', 'AAACGCACTGGTAC-1',
       'AAACGCTGACCAGT-1', 'AAACGCTGGTTCTT-1', 'AAACGCTGTAGCCA-1',
       'AAACGCTGTTTCTG-1',
       ...
       'TTTCAGTGTCACGA-1', 'TTTCAGTGTCTATC-1', 'TTTCAGTGTGCAGT-1',
       'TTTCCAGAGGTGAG-1', 'TTTCGAACACCTGA-1', 'TTTCGAACTCTCAT-1',
       'TTTCTACTGAGGCA-1', 'TTTCTACTTCCTCG-1', 'TTTGCATGAGAGGC-1',
       'TTTGCATGCCTCAC-1'],
      dtype='object', length=2638)
[right]: Index(['AAACATACAACCAC-1', 'AAACATTGAGCTAC-1', 'AAACATTGATCAGC-1',
       'AAACCGTGCTTCCG-1', 'AAACCGTGTATGCG-1', 'AAACGCACTGGTAC-1',
       'AAACGCTGACCAGT-1', 'AAACGCTGGTTCTT-1', 'AAACGCTGTAGCCA-1',
       'AAACGCTGTTTCTG-1',
       ...
       'TTTCAGTGTCACGA-1', 'TTTCAGTGTCTATC-1', 'TTTCAGTGTGCAGT-1',
       'TTTCCAGAGGTGAG-1', 'TTTCGAACACCTGA-1', 'TTTCGAACTCTCAT-1',
       'TTTCTACTGAGGCA-1', 'TTTCTACTTCCTCG-1', 'TTTGCATGAGAGGC-1',
       'TTTGCATGCCTCAC-1'],
      dtype='object', name='index', length=2638)

^ In this case the mismatched values are in the middle, and it's a relatively small dataset.

I'm fine to keep this for now, but also fine to remove it later without needing to keep the same level of detail.

@ivirshup
Copy link
Member

ivirshup commented Aug 22, 2023

Oh, also should this be backported? It's tagged as 0.10, but seems like it could be fine for 0.9.3 (not that I suspect we'll be making that release)

@ivirshup
Copy link
Member

But do feel free to merge. The points above are for your consideration, but are not blocking

@flying-sheep flying-sheep modified the milestones: 0.10.0, 0.9.3 Aug 24, 2023
@flying-sheep flying-sheep merged commit 5428f16 into main Aug 24, 2023
12 checks passed
@flying-sheep flying-sheep deleted the fix-pandas-2.1 branch August 24, 2023 14:58
@flying-sheep flying-sheep modified the milestones: 0.9.3, 0.10.0 Aug 25, 2023
@scverse scverse deleted a comment from lumberbot-app bot Aug 25, 2023
@flying-sheep
Copy link
Member Author

OK, doesn’t seem like this would merge anywhere near cleanly. Let’s not risk it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants