Improvements on non-atomic value validation #179

nicolasblumenroehr · 2023-11-22T09:16:48Z

The record validation for nested structures could be standardized. Currently, the validation seems to depend on the validation schema provided by the DTR. For example, for the checksum type, a valid value entry would be: "{'md5sum': '723140e4864011bdf1fbc66698a0f041'}". However, if the DTR provides no validation schema for a PID-Info Type, nested structures are not validated at all. Furthermore, it would be more reasonable to use the PIDs of the keys within such structures instead of their names, as only the PID is persistent and recognizable by clients. In the checksum example, this would look like this: "{'21.T11148/ef277087753e8ba2e606': '723140e4864011bdf1fbc66698a0f041'}"
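To illustrate the proposed key change, here is a minimal sketch. It assumes the client already knows which PID belongs to which sub-type name. The table `NAME_TO_PID` and the function `to_pid_keys` are hypothetical and not part of the Typed PID Maker. The only name/PID pair used is the one from the checksum example, and only top-level keys are rewritten.

```python
import json

# Hypothetical lookup table; in practice this information would have to
# come from the DTR. Only the pair from the checksum example is listed.
NAME_TO_PID = {"md5sum": "21.T11148/ef277087753e8ba2e606"}


def to_pid_keys(value: dict, name_to_pid: dict) -> dict:
    """Return a copy of `value` whose top-level keys are PIDs instead of names.

    Raises KeyError if a name has no known PID.
    """
    return {name_to_pid[name]: item for name, item in value.items()}


record_value = {"md5sum": "723140e4864011bdf1fbc66698a0f041"}
pid_keyed = to_pid_keys(record_value, NAME_TO_PID)
print(json.dumps(pid_keyed))
```

An unknown name raises a `KeyError` rather than passing through unchanged, so a value can never end up with a mix of names and PIDs as keys.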

ThomasJejkal · 2023-11-22T16:52:52Z

Actually, I think it's not a fault of the Typed PID Maker (or something which can/should be handled there), as it seems to use the validation schema provided by the DTR. If you take a look at checksum, it consists of different possible attributes, which are data types themselves. Scrolling to the end, where the validation schema can be found, shows that the schema contains a set of definitions containing the PIDs of the sub types (with slashes replaced by '_' for technical reasons), but the expected properties are the plain names of the particular types, e.g., md5sum.

I'm not sure why this is the case, maybe also technical reasons because an object with a key like

{'21.T11148/ef277087753e8ba2e606': '723140e4864011bdf1fbc66698a0f041'}

may cause problems if you want to select the value via JSON-Path (as dots are typically interpreted as attribute separators). Maybe that's the reason, maybe it's just a logical mistake in the DTR.
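The dot problem can be reproduced with a small sketch. The function `select` below is a hypothetical, deliberately naive dotted-path lookup written for illustration. It is not a real JSON-Path implementation and not what the DTR or the Typed PID Maker uses.

```python
def select(document: dict, path: str):
    """Naive dotted-path lookup: split on '.' and walk nested dicts."""
    current = document
    for part in path.split("."):
        current = current[part]
    return current


PID = "21.T11148/ef277087753e8ba2e606"
MD5 = "723140e4864011bdf1fbc66698a0f041"

by_name = {"md5sum": MD5}
by_pid = {PID: MD5}

# A plain name contains no dot, so the lookup works.
found = select(by_name, "md5sum")

# The PID contains a dot, so it is split into "21" and "T11148/...",
# and the lookup fails on the first fragment.
try:
    select(by_pid, PID)
    failed_on = None
except KeyError as error:
    failed_on = error.args[0]

print(found, failed_on)
```

JSON-Path's bracket notation with a quoted key (for example `$['21.T11148/ef277087753e8ba2e606']`) avoids the split, so this looks like an inconvenience rather than a hard blocker.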

However, in this particular case I would argue that for something general like a checksum including its algorithm, a simpler type like etag might be the better choice in order not to go too deep into fine-grained type definitions.

nicolasblumenroehr · 2023-11-27T09:57:22Z

Right, the TPM uses the validation schema of the DTR entry, but if no validation schema is provided, which may happen since the schema is no longer created automatically as it used to be, there is no validation at all. In this case, for multi-level data types, you could provide anything in the record value. I just wanted to point out that this might be an issue; maybe a validation schema should be a prerequisite for record validation. Regarding the property names of the nested data types, I see this is again a DTR issue, but it causes big trouble for nested records, as the PID of the sub data types is lost and the name is essentially just a human-readable representation.
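A minimal sketch of what "schema as a prerequisite" could mean. The names `validate_value`, `MissingSchemaError`, and `require_schema` are hypothetical, and the schema is modelled as a plain callable instead of a real DTR schema. This is not how the TPM is implemented.

```python
import re
from typing import Callable, Optional


class MissingSchemaError(ValueError):
    """Raised when a type offers no validation schema in strict mode."""


def validate_value(
    value: object,
    schema_check: Optional[Callable[[object], bool]],
    require_schema: bool = True,
) -> bool:
    """Validate `value` with `schema_check`, or refuse if there is none."""
    if schema_check is None:
        if require_schema:
            raise MissingSchemaError("no validation schema available for this type")
        # Lenient mode mirrors the behaviour described above: nothing is checked.
        return True
    return schema_check(value)


def looks_like_md5(value: object) -> bool:
    """Stand-in for a schema: an object with one 32-digit hex md5sum."""
    return (
        isinstance(value, dict)
        and set(value) == {"md5sum"}
        and isinstance(value["md5sum"], str)
        and re.fullmatch(r"[0-9a-f]{32}", value["md5sum"]) is not None
    )


ok = validate_value({"md5sum": "723140e4864011bdf1fbc66698a0f041"}, looks_like_md5)
bad = validate_value({"md5sum": "not-a-hash"}, looks_like_md5)
lenient = validate_value({"anything": "goes"}, None, require_schema=False)
print(ok, bad, lenient)
```

With `require_schema=True`, a type without a schema produces an explicit error instead of silently accepting any value.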

ThomasJejkal · 2023-11-27T12:48:35Z

So far, I've assumed that the DTR creates validation schemas automatically for all DataTypes. Do you have an example of a DataType without any validation information?

Pfeil · 2024-08-30T15:10:06Z

I read multiple ideas out of this:

1. A provided record to create could, instead of a PID as a value to an attribute, contain the record of another FAIR DO to create(?)
   a. Not sure if this was really the idea or I misunderstood things
2. For validation, instead of the schema (which uses names) we could do the full validation manually. This would require even more requests, which means we would need to do extreme speedups in validation (see Potential timeout error #136 and Type-Api support and validation speedup #218)
   a. potentially possible, but requires performance improvements which can be hard to achieve according to Type-Api support and validation speedup #218
   b. potentially incompatible with DTR schemas, if we are not careful
3. In general, for non-atomic values, use PIDs instead of attribute names
   a. implies incompatibility with DTR schemas, but I like the idea

Pfeil changed the title ~~Record validation~~ Validation of records with nested records (nested profiles) and without provided schemas Aug 30, 2024

Pfeil added the enhancement, question, and More information needed labels Aug 30, 2024

Pfeil changed the title ~~Validation of records with nested records (nested profiles) and without provided schemas~~ Improvements on non-atomic value validation Aug 30, 2024
