Zarr-v3 Consolidated Metadata (#2113)
* Fixed MemoryStore.list_dir

  Ensures that nested children are listed properly.
* fixup s3
* recursive Group.members

  This PR adds a recursive=True flag to Group.members, for recursively listing the members of some hierarchy. This is useful for Consolidated Metadata, which needs to recursively inspect children. IMO, it's useful (and simple) enough to include in the public API.
* Zarr-v3 Consolidated Metadata

  Implements the optional Consolidated Metadata feature of zarr-v3.
* fixup
* read zarr-v2 consolidated metadata
* check writable
* Handle non-root paths
* Some error handling
* cleanup
* refactor open
* remove dupe file
* v2 getitem
* fixup
* Optimized members
* Impl flatten
* Fixups
* doc
* nest the tests
* fixup
* Fixups
* fixup
* fixup
* fixup
* fixup
* consistent open_consolidated handling
* fixup
* make clear that flat_to_nested mutates
* fixup
* fixup
* Fixup
* fixup
* fixup
* fixup
* fixup
* added docs
* fixup
* Ensure empty dict
* fixed name
* fixup nested
* removed dupe tests
* fixup
* doc fix
* fixups
* fixup
* fixup
* v2 writer
* fixup
* fixup
* path fix
* Fixed v2 use_consolidated=False
* fixup
* Special case object dtype

  Closes #2315
* fixup
* docs
* pr review
* must_understand
* Updated from_dict checking
* cleanup
* cleanup
* Fixed fill_value
* fixup
1 parent 6b11bb8, commit 3964eab
Showing 14 changed files with 1,732 additions and 67 deletions.
@@ -0,0 +1,74 @@
Consolidated Metadata
=====================

Zarr-Python implements the `Consolidated Metadata`_ extension to the Zarr Spec.
Consolidated metadata can reduce the time needed to load the metadata for an
entire hierarchy, especially when the metadata is being served over a network.
Consolidated metadata essentially stores all the metadata for a hierarchy in the
metadata of the root Group.

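The idea of storing every node's metadata in the root can be sketched without
Zarr-Python at all. The example below is a simplified model, not the library's
implementation: the dict-backed ``store``, the ``consolidate`` helper, and the
flat path-to-metadata layout under a ``consolidated_metadata`` key are all
assumptions made for illustration, and the layout does not claim to match the
on-disk format.

```python
import json

# Hypothetical in-memory store: keys are paths, values are JSON metadata documents.
store = {
    "zarr.json": json.dumps({"node_type": "group", "attributes": {}}),
    "a/zarr.json": json.dumps({"node_type": "array", "shape": [1]}),
    "child/zarr.json": json.dumps({"node_type": "group", "attributes": {"kind": "child"}}),
    "child/b/zarr.json": json.dumps({"node_type": "array", "shape": [2, 2]}),
}


def consolidate(store: dict) -> dict:
    """Copy every non-root metadata document into the root document.

    Illustrative only: the flat path-to-metadata mapping is an assumption
    of this sketch, not the layout Zarr-Python writes.
    """
    root = json.loads(store["zarr.json"])
    suffix = "/zarr.json"
    root["consolidated_metadata"] = {
        key[: -len(suffix)]: json.loads(doc)
        for key, doc in sorted(store.items())
        if key.endswith(suffix)
    }
    store["zarr.json"] = json.dumps(root)
    return root


root = consolidate(store)
paths = list(root["consolidated_metadata"])
print(paths)
```

After ``consolidate`` runs, a single read of the root document is enough to
learn about ``a``, ``child`` and ``child/b``.
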
Usage
-----

If consolidated metadata is present in a Zarr Group's metadata then it is used
by default. The initial read to open the group will need to communicate with
the store (reading from a file for a :class:`zarr.store.LocalStore`, making a
network request for a :class:`zarr.store.RemoteStore`). After that, any subsequent
metadata reads to get child Group or Array nodes will *not* require reads from the store.

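The read pattern described above can be illustrated with a toy store that
counts how often it is asked for a metadata document. ``CountingStore`` and the
simplified document layout are hypothetical and exist only for this sketch;
they are not part of Zarr-Python.

```python
import json


class CountingStore:
    """Hypothetical dict-backed store that counts metadata reads."""

    def __init__(self, docs):
        self._docs = {key: json.dumps(doc) for key, doc in docs.items()}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return json.loads(self._docs[key])


children = {"a": {"shape": [1]}, "b": {"shape": [2, 2]}, "c": {"shape": [3, 3, 3]}}

# Without consolidation: one read for the root, then one more per child.
plain = CountingStore(
    {"zarr.json": {}, **{f"{name}/zarr.json": doc for name, doc in children.items()}}
)
plain.get("zarr.json")
plain_meta = {name: plain.get(f"{name}/zarr.json") for name in children}

# With consolidation: the root document already carries the children's metadata.
consolidated = CountingStore({"zarr.json": {"consolidated_metadata": children}})
root = consolidated.get("zarr.json")
consolidated_meta = {name: root["consolidated_metadata"][name] for name in children}

print(plain.reads, consolidated.reads)  # 4 1
```

Both approaches yield the same metadata. The difference is only in how many
round trips to the store are needed, which matters most when each read is a
network request.
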
In Python, the consolidated metadata is available on the ``.consolidated_metadata``
attribute of the ``GroupMetadata`` object.

.. code-block:: python

   >>> import zarr
   >>> store = zarr.store.MemoryStore({}, mode="w")
   >>> group = zarr.open_group(store=store)
   >>> group.create_array(shape=(1,), name="a")
   >>> group.create_array(shape=(2, 2), name="b")
   >>> group.create_array(shape=(3, 3, 3), name="c")
   >>> zarr.consolidate_metadata(store)

If we open that group, the Group's metadata has a :class:`zarr.ConsolidatedMetadata`
that can be used.

.. code-block:: python

   >>> consolidated = zarr.open_group(store=store)
   >>> consolidated.metadata.consolidated_metadata.metadata
   {'b': ArrayV3Metadata(shape=(2, 2), fill_value=np.float64(0.0), ...),
    'a': ArrayV3Metadata(shape=(1,), fill_value=np.float64(0.0), ...),
    'c': ArrayV3Metadata(shape=(3, 3, 3), fill_value=np.float64(0.0), ...)}

Operations on the group to get children automatically use the consolidated metadata.

.. code-block:: python

   >>> consolidated["a"]  # no read / HTTP request to the Store is required
   <Array memory://.../a shape=(1,) dtype=float64>

With nested groups, the consolidated metadata is available on the children, recursively.

.. code-block:: python

   >>> child = group.create_group("child", attributes={"kind": "child"})
   >>> grandchild = child.create_group("child", attributes={"kind": "grandchild"})
   >>> consolidated = zarr.consolidate_metadata(store)

   >>> consolidated["child"].metadata.consolidated_metadata
   ConsolidatedMetadata(metadata={'child': GroupMetadata(attributes={'kind': 'grandchild'}, zarr_format=3, )}, ...)

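One way to picture the recursive behaviour is as a conversion from a flat
mapping of paths to a nested one, so that each group carries the metadata of
its own descendants. The sketch below is a hypothetical, pure-Python
illustration: the ``nest`` helper and the ``children`` key are invented for
this example and are not Zarr-Python's API. It assumes that every parent path
is itself present in the flat mapping.

```python
import copy


def nest(flat):
    """Build a nested mapping from a flat {path: metadata} mapping.

    Returns a new structure and leaves ``flat`` untouched. Raises KeyError
    if a parent path is missing from ``flat``.
    """
    nested = {}
    # Visit shallow paths first so that parents exist before their children.
    for path in sorted(flat, key=lambda p: p.count("/")):
        *parents, name = path.split("/")
        level = nested
        for parent in parents:
            level = level[parent]["children"]
        level[name] = {**copy.deepcopy(flat[path]), "children": {}}
    return nested


flat = {
    "child": {"attributes": {"kind": "child"}},
    "child/child": {"attributes": {"kind": "grandchild"}},
}
nested = nest(flat)
print(nested["child"]["children"]["child"]["attributes"])
```

Looking up ``nested["child"]`` gives a node whose own ``children`` mapping
contains the grandchild, mirroring the ``consolidated["child"]`` example above.
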
Synchronization and Concurrency
-------------------------------

Consolidated metadata is intended for read-heavy use cases on slowly changing
hierarchies. For hierarchies where new nodes are constantly being added,
removed, or modified, consolidated metadata may not be desirable.

1. It will add some overhead to each update operation, since the metadata
   would need to be re-consolidated to keep it in sync with the store.
2. Readers using consolidated metadata will regularly see a "past" version
   of the metadata: the state at the time they read the root node with its
   consolidated metadata.

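Both points can be seen in a small model of a writer and several readers. The
names ``node_metadata``, ``root_snapshot``, ``consolidate`` and
``open_consolidated`` are hypothetical stand-ins chosen for this sketch; they
do not correspond to Zarr-Python functions.

```python
import copy

# Hypothetical stand-ins: per-node metadata as the writer sees it, and the
# snapshot of that metadata embedded in the root node.
node_metadata = {"a": {"shape": [1]}}
root_snapshot = {}


def consolidate():
    """Refresh the root snapshot from the current node metadata."""
    root_snapshot.clear()
    root_snapshot.update(copy.deepcopy(node_metadata))


def open_consolidated():
    """Read the root once and return the reader's private copy."""
    return copy.deepcopy(root_snapshot)


consolidate()
reader = open_consolidated()

# A writer adds a node. Until it re-consolidates, even new readers miss "b".
node_metadata["b"] = {"shape": [2, 2]}
late_reader = open_consolidated()

# Re-consolidating is extra work on every update, and readers that opened
# earlier still hold their old view until they reopen the root.
consolidate()
fresh_reader = open_consolidated()

print(sorted(reader), sorted(late_reader), sorted(fresh_reader))
# ['a'] ['a'] ['a', 'b']
```
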
.. _Consolidated Metadata: https://zarr-specs.readthedocs.io/en/latest/v3/core/v3.0.html#consolidated-metadata
@@ -10,6 +10,7 @@ Zarr-Python

getting_started
tutorial
consolidated_metadata
api/index
spec
release