-
-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
experiment with new dummy anndata #193
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing that was tested originally, was writing & reading and expecting the same results. I don't think this is tested with the current tests, and I think it is still a nice feature to test consistency of the writing and reading functions?
ad_new$obs[[name]], | ||
data$obs[[name]], | ||
adata_r$obs[[name]], | ||
adata_py$obs[[name]], | ||
ignore_attr = TRUE, | ||
tolerance = 1e-6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably should detect small data changes and signal them to the user instead of ignoring them, if there is a difference between the precision of certain datatypes
-- this is something that can happen at a later state, but we should keep it in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think inherent issues with floating point arithmetic are relevant to report on, but rather it's something users need to be generally aware of when using FPs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, if the tests in #170 catch the issues wrt issues like {anndataR} storing as float32 and {anndata} storing as float64
@lazappi do you have any suggestions for this PR? The purpose of the PR is to have a roundtrip test for anndataR which doesn't use reticulate's py_to_r or r_to_py (but does use reticulate to execute python code) |
I think the structure of it looks ok (some of the variables could have better names though 😸). From what I can see, this tests writing a file with Python and reading it into R, and writing and file with R and reading it in Python. I think that addresses @LouiseDck's comment? I guess you could make the object more complex to test more things but that could also be added later. Was there something more specific you wanted me to look at? |
That sums it up, indeed! Ideally, I'd directly check whether what R reads in is the same as what python had written, because we don't check whether the R matrix we get out of reading X (for instance) has the right values, only that we are able to reproduce the same data after we write it back out again. However, this approach is one way to be able to get reticulate out of the equation to make sure that the errors we observe are not due to reticulate conversions.
Wanted to make sure that this approach made sense to you :)
Other data structures (varm, etc) will be covered in a separate PR. |
@LouiseDck : To begin with, what would you think about the following as roundtrip tests? This would require the least amount of refactoring in comparison to the current unit tests, and would allow us to test our reading and writing capabilities.
WDYT?