-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect and warn if empty entries in config.stratigraphy #386
Detect and warn if empty entries in config.stratigraphy #386
Conversation
@@ -109,8 +115,51 @@ def _check_global_config( | |||
"One or more keys required for valid metadata are not found: " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
msg += ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅
src/fmu/dataio/dataio.py
Outdated
# check "stratigraphy" | ||
if "stratigraphy" in globalconfig: | ||
# we currently allow (why?) stratigraphy key missing from config. | ||
strat = globalconfig["stratigraphy"] | ||
if not isinstance(strat, dict): | ||
msg += "The 'stratigraphy' must be a dictionary.\n" | ||
|
||
# Loop the entries in 'stratigraphy' | ||
# These keys are custom, but we want error messages to point to the key when | ||
# issues are discovered. This makes it tricky to use jsonschema. Pydantic might | ||
# be an option. But for now, just go through all items and do defined checks. | ||
|
||
for key, item in strat.items(): | ||
if "name" not in item: | ||
msg += f"stratigraphy.{key}: 'name' is missing. \n" | ||
elif not isinstance(item["name"], str): | ||
msg += f"stratigraphy.{key}: 'name' must be a string.\n" | ||
|
||
if "stratigraphic" not in item: | ||
msg += f"stratigraphy.{key}: 'stratigraphic' is missing.\n" | ||
elif not isinstance(item["stratigraphic"], bool): | ||
msg += f"stratigraphy.{key}: 'stratigraphic' must be a boolean.\n" | ||
|
||
if "alias" in item: | ||
if not isinstance(item["alias"], list): | ||
msg += f"stratigraphy.{key}: 'alias' must be a list.\n" | ||
else: | ||
for alias in item["alias"]: | ||
if not isinstance(alias, str): | ||
msg += ( | ||
f"stratigraphy.{key}: 'alias' items must be strings\n" | ||
) | ||
|
||
if "stratigraphic_alias" in item: | ||
if not isinstance(item["stratigraphic_alias"], list): | ||
msg += f"stratigraphy.{key}: 'stratigraphic_alias' must be list.\n" | ||
else: | ||
for alias in item["stratigraphic_alias"]: | ||
if not isinstance(alias, str): | ||
msg += f"stratigraphy.{key}: 'stratigraphic_alias' items " | ||
msg += "must be strings.\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gets many lines of code. Could be an idea to refactor check of the global config to a new module (even a class)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, agree. This doesn't scale very well. Have explored using a schema for this, and then using jsonschema
to validate but experiments show it's not a great solution (either). Pydantic is also a possibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I discussed this a little bit also in #383. This sounds like a separate issue, perhaps, and a slightly bigger job. But also sounds that it makes sense to carve out the functionality of the global_variables. Perhaps the config should be a class, rather than a dictionary inside fmu-dataio 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we in general detect entries will null content, like the example in #386 ? In this case:
alias:
- MyFormation
- OtherAlias
- # <-- PROBLEM!
Just remove the entry at the PROBLEM line?
I guess we can sanitize and remove before we use it, but that will leave the problem in place for the next application to discover it. Idea was that it would be better to address it, so that it would be fixed. But we can also do both, I think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Solve #383. But there are probably better ways. See issue for details.