
Feat/multiscale tests #25

Merged · 28 commits · Nov 22, 2024
Conversation

@d-v-b (Contributor) commented Nov 22, 2024

I bring in a lot of the tests and validation logic from pydantic-zarr.
A few notable changes I added here:

  • I added .build methods to VectorScale and VectorTranslation that let you create the respective transform without explicitly passing in the type as a (redundant) keyword argument.
  • I added a private _build_transforms function that creates tuple[VectorScale] | tuple[VectorScale, VectorTranslation]; this is a convenience function for making coordinateTransformations metadata.
  • I added validation functions for transforms and axes, and the corresponding pydantic model type annotations.

@d-v-b (Contributor, Author) commented Nov 22, 2024

I added validation and tests to ensure that the number of multiscales.axes matches the dimensionality of the transforms defined in multiscales.coordinateTransformations (if present), and separate validation and tests to ensure that the axes match the dimensionality of the per-dataset coordinate transformations.

@dstansby marked this pull request as draft November 22, 2024 10:40
@d-v-b marked this pull request as ready for review November 22, 2024 12:26
@d-v-b (Contributor, Author) commented Nov 22, 2024

i think this is ready for review!

@d-v-b requested a review from dstansby November 22, 2024 12:26
@d-v-b (Contributor, Author) commented Nov 22, 2024

this should resolve #33

pyproject.toml Outdated
@@ -3,8 +3,8 @@ name = "ome-zarr-models"
 dynamic = ["version"]
 description = "Data models for OME-Zarr"
 readme = "README.md"
-requires-python = ">=3.11"
 dependencies = []
+requires-python = ">=3.10"
Contributor

Shouldn't we keep this as >=3.11, as per SPEC 0?

Contributor (Author)

ah right, this change needs to be reverted

scale: list[float]

@classmethod
def build(cls, data: Iterable[float]) -> Self:
Contributor

Could we instead give type a default (type: Literal["scale"] = "scale"), and then if you want to construct one of these you can just do VectorScale(scale=[1, 2, 3])? I think that's a slightly nicer API and avoids us having to write a build method.

Contributor (Author)

the argument against giving the type a default value was that this would allow a JSON object without a type field to pass validation, even though that JSON object is technically an invalid axis
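The trade-off can be illustrated with two hypothetical models (not the PR's classes): giving the discriminator a default means JSON that omits the required type field still passes validation.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class StrictScale(BaseModel):
    type: Literal["scale"]            # required: spec-invalid JSON is rejected
    scale: list[float]


class LenientScale(BaseModel):
    type: Literal["scale"] = "scale"  # defaulted: a missing field is filled in
    scale: list[float]


data = {"scale": [1.0, 2.0]}  # no "type" key: technically invalid metadata

lenient = LenientScale.model_validate(data)  # passes, silently repairing input
try:
    StrictScale.model_validate(data)
except ValidationError:
    print("strict model rejects JSON without 'type'")
```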

Contributor (Author)

(I totally agree that the API is nicer if the default is the de-facto default)

@@ -69,3 +92,32 @@ class PathTranslation(Base):

type: Literal["translation"]
translation: str


def ndim(transform: VectorScale | VectorTranslation) -> int:
Contributor

In the spirit of don't repeat yourself, I'm not mad about having ndim defined as both a method on the classes and a function. Was there a particular reason for doing this?

Contributor (Author)

the main reason to have a free-floating function is to support the niche case of using a functional idiom like map or filter on transformation objects, e.g. ndims = map(ndim, transforms), but maybe this is too niche :)

pyproject.toml Outdated (resolved)


def ensure_transform_dimensionality(
transforms: tuple[VectorScale | PathScale]
Contributor

Would be good to extract the type here as a type alias (or type var???) to avoid repetition.

TransformType = tuple[VectorScale | PathScale] | tuple[VectorScale | PathScale, VectorTranslation | PathTranslation]

Contributor (Author)

definitely, we can define + export this union in the transformations module

NUM_TX_MAX = 2


def ensure_transform_dimensionality(
Contributor

Having read this function, to simplify it I would actually decouple the validation from the spec a bit, and write the validation function to check that an arbitrary list of transforms (that therefore have a .ndim attribute) have the same number of dimensions.

So:

  1. Remove all transforms from the list that are Path transforms
  2. If >= 2 transforms left, check they have same dimensions

I think this will make typing/logic a bit nicer, and also make this a bit more re-usable for whatever comes in coordinate transform changes.
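The two steps above could look roughly like this (hypothetical names, not the merged implementation; path-based transforms carry no dimensionality, so they are skipped via a .ndim check):

```python
from pydantic import BaseModel


class VectorScale(BaseModel):
    scale: list[float]

    @property
    def ndim(self) -> int:
        return len(self.scale)


class PathScale(BaseModel):
    path: str  # no dimensionality until the path is resolved


def ensure_same_dimensionality(transforms: tuple) -> tuple:
    # 1. Ignore transforms without a dimensionality (path-based ones).
    # 2. If two or more remain, require them to agree on ndim.
    ndims = {t.ndim for t in transforms if hasattr(t, "ndim")}
    if len(ndims) > 1:
        raise ValueError(
            f"Transforms have inconsistent dimensionality: {sorted(ndims)}"
        )
    return transforms
```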

Contributor (Author)

good idea! I will implement this

_check_unique = field_validator("multiscales")(_unique_items_validator)


class MultiscaleGroup(GroupSpec[MultiscaleGroupAttrs, ArraySpec | GroupSpec]):
Contributor

How is this meant to be used? I am a bit confused/not understanding what the purpose of this class is.

Contributor (Author)

this class will model an actual zarr group, i.e. the attributes (which are JSON and ought to contain multiscales etc) and the arrays (which ought to be the ones named by the path attribute in datasets, and which ought to have shape that's consistent with the number of axes, etc)

Contributor (Author)

this class is also the thing we deserialize a zarr group into

@d-v-b merged commit 0d05526 into BioImageTools:main Nov 22, 2024
2 checks passed
@d-v-b deleted the feat/multiscale-tests branch November 22, 2024 13:39