-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FSStore dimension separator bug workaround #43
Conversation
Thanks @sanath-2024 This should fix the problem I think but I need to run some manual tests since we don't have automated mock object store tests yet. There is also another potential solution that may be worth looking into, and would be cleaner if it works:
This way we don't break the self describing nature of Zarr. I've also mentioned this issue here: |
I think your solution is cleaner. Note that if the FSStore doesn't have a dimension separator it will default back to ".", making the filename ".zmetadata" - on the one hand this is cleaner and it is definitely what's supposed to happen; on the other hand it breaks the existing files that are called "zmetadata" or "/zmetadata". |
Created this issue with Zarr: zarr-developers/zarr-python#1121 In the meantime, @sanath-2024; this is what needs to be done:
These are the usages I could find, please take a deeper look. Let me know if you don't have time to work on this, I can make changes and commit here too. Please don't forget to run In tests: mdio-python/tests/unit/conftest.py Lines 104 to 109 in 509fdcb
mdio-python/tests/unit/conftest.py Line 127 in 509fdcb
mdio-python/tests/unit/conftest.py Lines 129 to 134 in 509fdcb
In SEG-Y conversion: mdio-python/src/mdio/segy/blocked_io.py Lines 81 to 87 in 811c105
mdio-python/src/mdio/segy/blocked_io.py Lines 110 to 116 in 811c105
mdio-python/src/mdio/converters/segy.py Lines 209 to 214 in e23ec72
|
closes #42 This PR will eliminate and backwards compatibility issues until Zarr v3 or we make major changes |
I can make those changes, and I'll make sure to run |
Patch to Avoid FSStore Bug
The Bug
Due to "overly aggressive canonicalization" in Zarr's FSStore implementation, all periods ('.') in a logical storage path are replaced with slashes ('/') when the dimension separator is a slash. One major consequence of this bug is that the
.zmetadata
file is implicitly renamed to/zmetadata
while writing and searching. This has no observable effects on the local filesystem on Linux, since the filesystem concatenates double slashes into a single slash. However, when the object is stored on a cloud object store (e.g. GCS), the double slash is not ignored. Additionally, even on the local filesystem, the concatenation of double slashes is technically an implementation-defined behavior and is therefore not safe. So, it makes sense to avoid the explicit or implicit use of double slashes.The Fix
The simplest way to avoid this bug is to avoid using '.' or '/' characters in the path to the metadata file. This change will not affect the ability to read existing MDIO files on the local FS, since the double slash was eliminated by the filesystem anyway. However, it will allow existing MDIO files to be properly read from object stores as well.