Data sources should share consistent metadata record types/entries in configuration #472

carlhiggs · 2024-08-08T00:06:54Z

Currently, similar but inconsistent records are gathered for different data sources when these are configured.

For example,

data paths: some entries ask for 'data', others for 'data_dir', regardless of whether a file or a folder can be configured. This should just be consistent, e.g. 'data'
some ask for 'name', but others don't
some ask for 'licence', while others don't (e.g. in custom aggregation)
the same applies to 'citation'

et cetera.

Really, there should be a base generic class for data sources that has the minimum required information that should be associated with data, and then specific datasets could build off this extending with data-specific attributes as required.
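
As a rough illustration of the idea, here is a minimal sketch using Python dataclasses. The class names (`DataSource`, `PopulationGrid`), the helper `load_source`, and the extra attributes `resolution_m` and `year_target` are hypothetical and only for illustration; the shared fields are the ones mentioned above (name, data, licence, citation). This is not the project's current implementation.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Optional


@dataclass
class DataSource:
    """Hypothetical base record: minimum metadata shared by every data source."""

    name: str
    data: str  # path to a file or a folder, always under the same key
    licence: str
    citation: str


@dataclass
class PopulationGrid(DataSource):
    """Hypothetical dataset-specific record extending the shared base."""

    resolution_m: Optional[int] = None
    year_target: Optional[int] = None


def load_source(cls, record: dict):
    """Build a record of type `cls` from a configuration mapping.

    Raises ValueError naming any required keys that are missing and any
    keys that the record type does not define.
    """
    known = {f.name for f in fields(cls)}
    required = {
        f.name
        for f in fields(cls)
        if f.default is MISSING and f.default_factory is MISSING
    }
    missing = sorted(required - set(record))
    unknown = sorted(set(record) - known)
    if missing or unknown:
        raise ValueError(
            f"{cls.__name__}: missing keys {missing}, unknown keys {unknown}"
        )
    return cls(**record)


grid = load_source(
    PopulationGrid,
    {
        "name": "Example population grid",
        "data": "population/example.tif",
        "licence": "CC BY 4.0",
        "citation": "Example citation",
        "resolution_m": 100,
    },
)
```

With this kind of structure, every data source is validated against the same required keys, and a configuration that still uses 'data_dir' would be reported as having an unknown key and a missing 'data' key.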

This will make configuration easier to both complete and code for (e.g. #414), by being consistent, and more maintainable as the type definitions will be centralised in shared classes.

Implementing this would be a breaking change, as the configuration format would be updated to use the more consistent record gathering. Potentially, a script to update/translate older configuration files to the newer format could be developed, but this may not be necessary if configuration files are tied to a specific software version; we already record the study region template version in the header comment (e.g. v4.2.2). Future configuration versions could record this not in a comment, but as a parameter at the top of the file that can be checked. In this way, older configurations could be distinguished from new ones. This is a separate issue really.
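
For the version check, something like the following could work once the version is a parameter rather than a comment. This is only a sketch under assumptions: the key name `template_version` and the functions `parse_version` and `needs_update` are hypothetical, and the configuration is assumed to have already been loaded into a dictionary.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as 'v4.2.2' or '4.2.2' into a comparable tuple."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def needs_update(config: dict, current: str = "4.2.2") -> bool:
    """Return True if the configuration predates the `current` template version.

    `template_version` is a hypothetical key name. A configuration without it
    is treated as an older file that only recorded the version in a comment.
    """
    declared = config.get("template_version")
    if declared is None:
        return True
    return parse_version(declared) < parse_version(current)


print(needs_update({"template_version": "v4.2.2"}))  # False
print(needs_update({"template_version": "4.1.0"}))   # True
print(needs_update({}))                              # True
```

Comparing tuples of integers rather than raw strings avoids treating '4.10.0' as older than '4.2.2'.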
