After some experimentation with asdf-format/asdf#446 and some discussions with @Cadair about other potential use cases, I think it would be useful if tags were allowed to contain URI fragments with implementation-specific annotations.
Let's start with a concrete example for illustration. Imagine there is a schema definition for a type called Foo. The associated tag label is "tag:example.org:custom/foo-1.0.0". My instances of Foo are serialized using this schema definition.
However, imagine that I also have a type called Bar which is a subclass of Foo. I would like to serialize Bar and validate it using the Foo schema, but I want to be able to round-trip instances of Bar. How can I do this without adding a new schema definition?
The solution I am suggesting here is to allow my implementation to add a URI fragment to the tag label that indicates that the Foo schema should be used for validation purposes, but I have actually serialized a subclass of this type. Such a tag label might look like this:
"tag:example.org:custom/foo-1.0.0#subclass=Bar".
Basically, what I am proposing is that the validation machinery should ignore all URI fragments and only use the tag itself when resolving schemas. However, implementations may take URI fragments into account when processing types. In this case, my implementation would recognize the subclass=Bar portion of the tag as indicating that a different subclass should be used when restoring this type.
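To make the proposal concrete, here is a minimal sketch in Python of how an implementation might split such a tag. It is not taken from any existing ASDF code: the names split_tag, resolve, SCHEMAS, TYPES and SUBCLASSES are hypothetical, and parsing the fragment as key=value pairs is an assumption based on the example tag above.

```python
from urllib.parse import parse_qsl


class Foo:
    """Stand-in for the type described by the foo-1.0.0 schema."""


class Bar(Foo):
    """Stand-in for a subclass that reuses the Foo schema."""


FOO_TAG = "tag:example.org:custom/foo-1.0.0"

# Hypothetical registries; a real implementation would have its own.
SCHEMAS = {FOO_TAG: "foo-1.0.0 schema (placeholder)"}
TYPES = {FOO_TAG: Foo}
SUBCLASSES = {"Bar": Bar}


def split_tag(tag):
    """Split a tag into (base tag, dict of fragment annotations)."""
    base, _, fragment = tag.partition("#")
    # Assumption: the fragment is a query-like list of key=value pairs.
    return base, dict(parse_qsl(fragment))


def resolve(tag):
    """Return (schema, type) for a tag.

    The schema lookup uses only the base tag, so the fragment never
    affects validation. The type lookup may honor "subclass", and
    falls back to the base type when the name is missing or unknown.
    """
    base, annotations = split_tag(tag)
    schema = SCHEMAS[base]
    cls = SUBCLASSES.get(annotations.get("subclass"), TYPES[base])
    return schema, cls


schema, cls = resolve(FOO_TAG + "#subclass=Bar")
print(schema, cls.__name__)
```

An implementation that does not know about Bar would simply have no entry for it in its subclass registry and would restore the object as Foo, which is the "free to ignore" behavior described above.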
It is important to note that the YAML spec has some relevant language about this:
YAML does not mandate any special relationship between different tags that begin with the same substring. Tags ending with URI fragments (containing “#”) are no exception; tags that share the same base URI but differ in their fragment part are considered to be different, independent tags. By convention, fragments are used to identify different “variants” of a tag, while “/” is used to define nested tag “namespace” hierarchies. However, this is merely a convention, and each tag may employ its own rules. For example, Perl tags may use “::” to express namespace hierarchies, Java tags may use “.”, etc.
I take this to mean that tags containing URI fragments are valid in YAML, although the fragments do not necessarily indicate any particular relationship between tags. However, it also seems to imply that it would be perfectly reasonable for ASDF to encode such relationships using URI fragments.
Let's return to the example of the subclass Bar. In all likelihood, I want to serialize Bar instead of Foo because Bar contains some properties that are not adequately represented by Foo. When I serialize my Bar instance, I may be storing attributes that aren't described by the Foo schema. If a different implementation reads my file, and if it is free to ignore the "subclass=Bar" annotation, it may miss these additional properties entirely. This could be problematic.
However, I would argue that this situation is already possible in ASDF. If a schema does not explicitly set additionalProperties: False, then any specific implementation is free to add properties that will not be validated by the schema. So I would argue that we're not any worse off by allowing subclasses to be encoded in this way. Naturally, it follows that subclasses can only be created for types with schemas where additionalProperties: True (which is the default).
There are a few other things to consider:
This is possible to handle in the Python implementation, but will it make other potential implementations more difficult? As far as I can tell, the C++ prototype should be able to handle this change.
Should specific query words such as "subclass" be recognized by the standard? Maybe a few such words can be encoded in the standard, but implementations are free to define others that are not explicitly in the standard.