Dynamic validation based on keys in data (internal reference) #253

Open
idantene opened this issue Oct 31, 2024 · 7 comments

Comments

@idantene
Contributor

idantene commented Oct 31, 2024

Hey,

I have an idea for dynamic schema validation that I think is currently missing (perhaps not a very common request, though I believe it could be implemented as another validator).

In this case, I have a data YAML file where a mapping is expected. The keys may be anything the user decides, and the values are according to some predefined schema - so far, so good.

Later in the data YAML, another mapping appears. Here, the values of that mapping have to refer to the keys defined earlier.

For example (a data YAML):

metrics:
  iou:  # this can be whatever the user chooses
    name: foo
    unit: bar
    direction: up
...
results:
  - name: something
    metric: iou  # this has to be one of the keys defined under `metrics` above
    value: 59
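
For illustration only, the constraint described above can be written as a plain Python check over the already-parsed data. This is a minimal sketch that uses no library API; the function name `find_dangling_metric_refs` and the extra `f1` result are made up for the example.

```python
def find_dangling_metric_refs(data):
    """Return (index, metric) pairs for results whose `metric`
    is not one of the keys defined under `metrics`."""
    defined = set(data.get("metrics", {}))
    return [
        (index, result.get("metric"))
        for index, result in enumerate(data.get("results", []))
        if result.get("metric") not in defined
    ]


# The data YAML from above as it would look after parsing, plus one
# deliberately broken result (hypothetical) to show a failure.
data = {
    "metrics": {"iou": {"name": "foo", "unit": "bar", "direction": "up"}},
    "results": [
        {"name": "something", "metric": "iou", "value": 59},
        {"name": "other", "metric": "f1", "value": 12},
    ],
}

print(find_dangling_metric_refs(data))  # [(1, 'f1')]
```

The point of the request is to express this kind of check in the schema itself rather than in a separate post-validation step.
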
@nbaju1

nbaju1 commented Nov 6, 2024

A similar feature was requested earlier: #154

@idantene
Contributor Author

idantene commented Nov 7, 2024

I hadn't noticed that one, my apologies.

I'm not sure why it's not within the scope of this project. Internal references are common, and since the file is read anyway, the schema can be dynamic in that sense...

@nbaju1

nbaju1 commented Nov 7, 2024

The validators themselves only have access to the object being validated, so I imagine this would require quite a large refactoring of the project to support this.

@idantene
Contributor Author

idantene commented Nov 7, 2024

Is that the case? It's been a while since I contributed to the project so I don't recall the details, but it seems that in schema.py#L80, the full contents of data are passed around.

That would suggest, for example, that this type of validator could be deferred until the data is provided.

@nbaju1

nbaju1 commented Nov 7, 2024

It's the Validator class that is used to validate an object, and it only receives the object itself, not the full YAML file. So this can't be solved by simply introducing a new validator.

@idantene
Contributor Author

idantene commented Nov 7, 2024

Of course, this would require a somewhat more complicated implementation (i.e. not simply subclassing the Validator class). That shouldn't be a problem though.

If I understand correctly, the full flow is as follows:

  1. Create a dictionary of validators in Schema (method _process_schema returns a dictionary which is then assigned to self._schema).
  2. Calls to Schema.validate pass the full YAML data content.
  3. The validate method passes the dictionary self._schema along, which eventually reaches the _validate_static_map_list method.
  4. In _validate_static_map_list, the keys of the data and the keys of the validator map are compared. If they mismatch, an error is raised. If they do match, we start iterating on a per-key basis by passing along the sub_validator, the key, and again, the full YAML data, to _validate_item.
  5. Finally, in _validate_item, we try to pull the relevant data-item from the full (or parent, since it recurses down eventually) YAML content, and then call _validate with the validator and data-item.

So, my suggestion would be to allow validators to be deferred at a higher level here. For example, a Validator could hold a boolean is_deferred, in which case we do not attempt to pull the specific data-item from the YAML content, but rather pass along either the parent or the full YAML data, depending on the reference type.
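
A rough sketch of how such a flag might behave, using toy classes rather than the project's actual code. Everything here is an assumption made for illustration: the `Validator`, `Str`, and `KeyOf` classes, the `validate_map` function, and the choice to hand the deferred validator both the item and the root document (a slight variation on passing the parent or full data alone).

```python
class Validator:
    """Toy stand-in for a validator; NOT the project's real class."""
    is_deferred = False

    def validate(self, value):
        return []


class Str(Validator):
    """Ordinary validator: sees only the item being validated."""

    def validate(self, value):
        return [] if isinstance(value, str) else [f"'{value}' is not a str"]


class KeyOf(Validator):
    """Hypothetical deferred validator: value must be a key of root[path]."""
    is_deferred = True

    def __init__(self, path):
        self.path = path

    def validate(self, value, root):
        allowed = root.get(self.path, {})
        if value in allowed:
            return []
        return [f"'{value}' is not a key under '{self.path}'"]


def validate_map(schema, data, root=None):
    """Validate one mapping; deferred validators also receive the root."""
    root = data if root is None else root
    errors = []
    for key, validator in schema.items():
        if key not in data:
            errors.append(f"{key}: required key missing")
            continue
        item = data[key]
        if validator.is_deferred:
            found = validator.validate(item, root)
        else:
            found = validator.validate(item)
        errors.extend(f"{key}: {error}" for error in found)
    return errors


root = {
    "metrics": {"iou": {"name": "foo", "unit": "bar", "direction": "up"}},
    "results": [
        {"name": "something", "metric": "iou"},
        {"name": "other", "metric": "f1"},
    ],
}
result_schema = {"name": Str(), "metric": KeyOf("metrics")}

for result in root["results"]:
    print(validate_map(result_schema, result, root))
# []
# ["metric: 'f1' is not a key under 'metrics'"]
```

The only change to the normal flow is the branch on `is_deferred`; non-deferred validators keep receiving just their own item.
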

@abourree
Contributor

abourree commented Nov 7, 2024

Hi,

Five years ago, I proposed two new validators in #82 that may answer your need. My PR was rejected because the maintainers don't want dynamic schemas. In a way, having a static schema is good practice.

Arnaud.
