-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposed new API for serializing custom types #815
Comments
So far we are more or less still prototyping with asdf so changes to the tag URI format are not really an issue up to this point. I like the idea of having more flexibility with handling different tags outside of the current "organization", "standard", and "name" structure. Some thoughts regarding the proposed changes off the top of my head:
I will have a closer look at the other changes as well. |
We definitely want keep the feature, dropping it from On the other hand, maybe it doesn't make sense to apply the custom validators to every object. It's possible that two extension developers will provide custom validators for the same property name with conflicting behavior. If we scoped the custom validator to only the |
I was about to ask the same question. I noticed that validators get registered globally regardless of which class I assign them to. So defining them on Regarding different validators per class I also thought about that and it does sound interesting. I am not sure about the current behavior but I doubt this is working with the current implementation. I must say I really like the current "global" behaviour since I don't have to worry about adding all my validators to all classes individually. Defining them together with the I guess the most flexible option could look somewhat like the following (no idea if this is possible to implement):
|
One more question regarding the following note @eslavich
What functionality would the additional interface provide? One current use case for Also, when dropping |
I haven't given this a lot of thought yet, but I find the current interface unsatisfying. There are potentially multiple ndarray objects that map to the same block, yet the So I'm vaguely hoping that we can devise a new interface that generalizes to other objects with blocks and returns an object of the correct type when the tree is read back in.
We definitely want to support that feature. At the very least it'll be possible using the advanced/extended interface (call it
I think I see the issue -- we might know that we want to manually set the tag to Removing |
yes, deciding on the correct version number was pretty much the same problematic use case I was thinking of as well Thank you for your explanations |
I think I have a good solution for the ctx variable: allow I'll link back to this issue when I open a PR. |
Just to have this here: I think some way to integrate the ideas behind asdf-format/asdf-standard#271 / #858 into the |
That feature would help us a lot, I think. We did a test drive of the proposed API and one class in particular ended up with a long clumsy list of tags: https://github.com/astropy/asdf-astropy/blob/master/asdf_astropy/tags/transform/compound.py#L38 That list would be lot nicer with wildcards: tags = {
"asdf://asdf-format.org/extensions/transform/tags/add-*",
"asdf://asdf-format.org/extensions/transform/tags/subtract-*",
"asdf://asdf-format.org/extensions/transform/tags/multiply-*",
"asdf://asdf-format.org/extensions/transform/tags/divide-*",
"asdf://asdf-format.org/extensions/transform/tags/power-*",
"asdf://asdf-format.org/extensions/transform/tags/compose-*",
"asdf://asdf-format.org/extensions/transform/tags/concatenate-*",
"asdf://asdf-format.org/extensions/transform/tags/fix_inputs-*",
"tag:stsci.edu:asdf/transform/add-*",
"tag:stsci.edu:asdf/transform/subtract-*",
"tag:stsci.edu:asdf/transform/multiply-*",
"tag:stsci.edu:asdf/transform/divide-*",
"tag:stsci.edu:asdf/transform/power-*",
"tag:stsci.edu:asdf/transform/compose-*",
"tag:stsci.edu:asdf/transform/concatenate-*",
"tag:stsci.edu:asdf/transform/fix_inputs-*",
} |
Here's another iteration on the converter API: asdf/asdf/extension/_converter.py Line 11 in 752d96e
I ended up bringing back the Tag patterns are permitted in the converter I've been thinking more about the validators and my latest idea is to define them on the extension itself, but optionally allow a list of tag URI patterns in case developers want to restrict their use to certain tags. The interface might look something like this: class Validator:
@property
def name(self):
# Return the name of the schema property that triggers the validation.
@property
def tags(self):
# Optional list of tag URIs or URI patterns that the validator should
# be used with.
def validate(self, schema_value, instance, tag, ctx):
# schema_value: the value assigned to the schema property
# instance: the YAML object to be validated
# tag: the YAML object's tag
# ctx: the same SerializationContext object as passed to the converters
#
# Implementation is expected to raise asdf.exceptions.ValidationError
# if the instance fails validation. My goal is to abstract away the jsonschema library so that we can swap in a faster implementation someday. Validation is currently the slowest part of opening an ASDF file. |
We've been considering a new API for serializing custom types and would like to solicit community input on our proposal. The original motivation for this was a desire to support multiple URI schemes in the same serializer. For example, as we move the transform schemas out of the ASDF standard and into their own package, we would like to change the current tag URIs:
to something like this:
The current API, ExtensionType, assumes the
tag:
URI scheme and forces each tag to share the same "organization", "standard", and "name". We thought we'd simplify by replacing those individual class attributes with a single list of full tags. Here's the proposed new API, called AsdfConverter:https://github.com/asdf-format/asdf/blob/asdf-standard-2.0.0-epic/asdf/converter.py#L28
I would like to point out several changes and consequences of this new design:
name
,organization
,standard
,version
,supported_versions
, andyaml_tag
class attributes have all been collapsed into one attribute,tags
, which in the case of the "add" transform schema we'd set like this:This gives developers the flexibility to use any URI scheme they choose, and allows for diverse tags to be handled by the same class.
Drop support for tags that aren't explicitly listed in the class attribute. The
ExtensionType
behavior is to use the serializer with the closest higher version if an exact match is unavailable. Instead, the library would refuse to handle the object if explicit support for the tag is unavailable, since to do otherwise is to risk losing data when the file is written back out.Drop the
handle_dynamic_subclasses
attribute and its associated feature. This is an experimental feature that writes a Python class name into the YAML so that the correct subclass can be automatically chosen when reconstructing the object on read. This is by definition implementation specific and we would prefer to encourage use of separate tags for these cases.The
ExtensionType
classes are capable of operating on themselves, that is to say, serializing and deserializing instances of the same class that defines thefrom_tree
andto_tree
behavior. In order to handle multiple tag versions, theExtensionType
metaclass would dynamically generate additional class definitions for each version. All this made for a system that was difficult to understand and troubleshoot, so withAsdfConverter
we're not permitting self-serialization and are creating instances of the class to handle each tag.Rename
from_tree
andto_tree
tofrom_yaml_tree
andto_yaml_tree
. The renaming is to help remind us what each method is intended to do, since in both cases the input and output are both trees.Drop the
from_tree_tagged
andto_tree_tagged
methods. These existed to allow developers to provide custom logic for choosing a tag. Now,to_tree_yaml
is permitted to return a tagged object directly.Drop the
ctx
variable from thefrom_yaml_tree
andto_yaml_tree
methods. In most cases, use of thectx
is not necessary -- only converters that deal directly with binary blocks need access to it. We're planning to introduce an additional interface for that type of converter.Drop the
requires
class attribute. Most implementations will reside in a package that lists required dependencies in the package metadata. Those that don't are well served by raisingImportError
from within thefrom_yaml_tree
method.The
validators
class attribute is not yet implemented forAsdfConverter
, but I suspect we'll need it. That feature is used internally to support custom validations on the shape and datatype of the ndarray objects.@Cadair @CagtayFabry I know you've both implemented support for custom types using the old API -- we'd appreciate hearing any thoughts you have on the new one.
The text was updated successfully, but these errors were encountered: