Can not write array and dgeMatrix objects #198

Open
ToryDeng opened this issue Nov 13, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@ToryDeng

I am unable to write objects of type array and dgeMatrix to .h5ad files. Here is an example R script demonstrating this issue:

library(anndataR)
adata <- generate_dataset(format = "AnnData")
adata$uns <- list(arr = array(1:30, dim = c(3, 2, 5)))
write_h5ad(adata, "test_adata.h5ad")
warnings()
Warning messages:
1: In value[[3L]](cond) :
  Could not write element 'layers/numeric_dense' of type 'dgeMatrix':
unknown type
2: In value[[3L]](cond) :
  Could not write element 'layers/numeric_dense_with_nas' of type 'dgeMatrix':
unknown type
3: In value[[3L]](cond) :
  Could not write element 'layers/integer_dense' of type 'dgeMatrix':
unknown type
4: In value[[3L]](cond) :
  Could not write element 'layers/integer_dense_with_nas' of type 'dgeMatrix':
unknown type
5: In value[[3L]](cond) :
  Could not write element 'obsm/numeric_dense' of type 'dgeMatrix':
unknown type
6: In value[[3L]](cond) :
  Could not write element 'obsm/numeric_dense_with_nas' of type 'dgeMatrix':
unknown type
7: In value[[3L]](cond) :
  Could not write element 'obsm/integer_dense' of type 'dgeMatrix':
unknown type
8: In value[[3L]](cond) :
  Could not write element 'obsm/integer_dense_with_nas' of type 'dgeMatrix':
unknown type
9: In value[[3L]](cond) :
  Could not write element 'varm/numeric_dense' of type 'dgeMatrix':
unknown type
10: In value[[3L]](cond) :
  Could not write element 'varm/numeric_dense_with_nas' of type 'dgeMatrix':
unknown type
11: In value[[3L]](cond) :
  Could not write element 'varm/integer_dense' of type 'dgeMatrix':
unknown type
12: In value[[3L]](cond) :
  Could not write element 'varm/integer_dense_with_nas' of type 'dgeMatrix':
unknown type
13: In value[[3L]](cond) :
  Could not write element 'obsp/numeric_dense' of type 'dgeMatrix':
unknown type
14: In value[[3L]](cond) :
  Could not write element 'obsp/numeric_dense_with_nas' of type 'dgeMatrix':
unknown type
15: In value[[3L]](cond) :
  Could not write element 'obsp/integer_dense' of type 'dgeMatrix':
unknown type
16: In value[[3L]](cond) :
  Could not write element 'obsp/integer_dense_with_nas' of type 'dgeMatrix':
unknown type
17: In value[[3L]](cond) :
  Could not write element 'varp/numeric_dense' of type 'dgeMatrix':
unknown type
18: In value[[3L]](cond) :
  Could not write element 'varp/numeric_dense_with_nas' of type 'dgeMatrix':
unknown type
19: In value[[3L]](cond) :
  Could not write element 'varp/integer_dense' of type 'dgeMatrix':
unknown type
20: In value[[3L]](cond) :
  Could not write element 'varp/integer_dense_with_nas' of type 'dgeMatrix':
unknown type
21: In value[[3L]](cond) : Could not write element 'uns/arr' of type 'array':
argument is not a matrix

Is there a way to resolve this issue, or is this functionality not yet implemented in anndataR?

lazappi added the bug label on Nov 21, 2024
@lazappi
Collaborator

lazappi commented Nov 21, 2024

We don't currently have support for either of these formats. Maybe we should consider it but neither of them is currently used. I'm not sure that generate_dataset() should output dgeMatrix objects rather than normal dense matrices and I'm a bit surprised it hasn't caused issues before now. Possibly something changed in {Matrix} or another package?

I'm less sure about supporting array as I'm not sure the H5AD specification has a definition for how to store n-dimensional arrays.

@LouiseDck @rcannood Do you have any other thoughts?

@LouiseDck
Collaborator

I was running into these dgeMatrix issues as well and was trying to figure out when you would want to use them instead of normal dense matrices? If I understand, you just wouldn't?

As far as I could figure out, there's no real difference between a matrix and an array object?

The generate_dataset() functionality is also there to systematically evaluate (in the tests I'm writing atm) which R datatypes we should write to h5ad and how they compare to the Python datatypes. One of the goals of this is to have a list of supported and unsupported datatypes.

@lazappi
Collaborator

lazappi commented Nov 27, 2024

> I was running into these dgeMatrix issues as well and was trying to figure out when you would want to use them instead of normal dense matrices? If I understand, you just wouldn't?

Yeah, I don't think there is any reason to use a dgeMatrix; I wasn't even aware they existed until now. If the default arguments to the dummy function create them, then I think they should be handled in a better way, otherwise users will be confused about why things are failing.

> As far as I could figure out, there's no real difference between a matrix and an array object?

A matrix is always 2D but an array is n-dimensional. In most cases a matrix is what you want but #204 was just opened so array probably needs to be supported as well (for slots where dimensions != 2 are allowed).
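
To make the matrix/array distinction concrete, here is a minimal base R sketch. It uses the same array as the original report; `n_dims()` is a hypothetical helper introduced only for this illustration and is not part of anndataR.

```r
# A matrix is the two-dimensional special case of an array.
m <- matrix(1:6, nrow = 2)
arr <- array(1:30, dim = c(3, 2, 5))  # same shape as 'uns/arr' in the report

is.matrix(m)    # TRUE: exactly two dimensions
is.array(m)     # TRUE: every matrix is also an array
is.matrix(arr)  # FALSE: three dimensions
is.array(arr)   # TRUE

# Hypothetical helper: number of dimensions of an object.
n_dims <- function(x) length(dim(x))
n_dims(arr)     # 3

# A single 2D slice of the array is an ordinary matrix again.
slice <- arr[, , 1]
is.matrix(slice)  # TRUE
```

This is consistent with the "argument is not a matrix" warning above: code that expects a matrix will reject the 3D array even though both objects carry a `dim` attribute.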

@rcannood
Collaborator

Let's fix this. It's true that a regular matrix could be used instead of a dgeMatrix, but a dgeMatrix is still a thing so HDF5AnnData should not flip out when we pass a dgeMatrix to it ^^
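
Until that is handled inside the package, one possible stopgap is to convert dense {Matrix} objects to base matrices before writing. The sketch below is an assumption about a user-side workaround, not the fix adopted in anndataR; `to_base_matrix()` is a hypothetical helper, and the example only relies on `Matrix::Matrix()`, `methods::is()` and `as.matrix()`.

```r
# Build a small dense general matrix. For a non-square numeric input,
# Matrix::Matrix(..., sparse = FALSE) returns a 'dgeMatrix'.
base_m <- matrix(c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5), nrow = 2)
dge <- Matrix::Matrix(base_m, sparse = FALSE)

# Hypothetical helper: turn dense Matrix-package objects into base
# matrices and leave everything else (including sparse matrices) as-is.
to_base_matrix <- function(x) {
  if (methods::is(x, "denseMatrix")) {
    as.matrix(x)
  } else {
    x
  }
}

converted <- to_base_matrix(dge)
is.matrix(converted)  # TRUE: now a plain base R matrix
```

The values are unchanged by the conversion; only the class differs, so a writer that already accepts base matrices should accept the converted object.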

5 participants
@rcannood @lazappi @LouiseDck @ToryDeng and others