special_cases::harvester::ignore_processed_files should allow for in-list reference merge #361

Open
shippy opened this issue Feb 4, 2020 · 0 comments

shippy commented Feb 4, 2020

In special_cases::harvester::ignore_processed_files::[site] (introduced in #341), we use a YAML reference to an array of files. At the time, I thought this would be both DRY and easily extensible - YAML can use references to merge dicts, so why not arrays?

It turns out that YAML can't merge arrays, at least not in a way that yields a predictably flat result. (- *DUPLICATED_PASAT_FILES translates to a list of lists.) Under the current structure, the only option is to extend the referenced array itself; otherwise you're out of luck.

A workaround is a value-less dict that looks like this:

harvester:
  ignore_processed_paths:
    _duplicates: &DUPLICATED_PASAT_FILES
      ? E-01099-M-7-2017-09-19.csv
      ? E-01318-M-2-2017-08-28.csv
    ohsu:
      pasat: 
        <<: *DUPLICATED_PASAT_FILES
        ? OHSU-specific-file.csv
    sri:
      pasat: 
        <<: *DUPLICATED_PASAT_FILES
        ? sri-specific-file.csv

After parsing, this comes out as

{'harvester': 
  {'ignore_processed_paths': 
    {'_duplicates': {'E-01099-M-7-2017-09-19.csv': None,
                     'E-01318-M-2-2017-08-28.csv': None},
      'ohsu': 
        {'pasat': {'E-01099-M-7-2017-09-19.csv': None,
                   'E-01318-M-2-2017-08-28.csv': None,
                   'OHSU-specific-file.csv': None}},
      'sri': 
        {'pasat': {'E-01099-M-7-2017-09-19.csv': None,
                   'E-01318-M-2-2017-08-28.csv': None,
                   'sri-specific-file.csv': None}}}}}

This should only require a straightforward change to config_utils.py::flatten_path_dict: check whether val is None and, if so, output.append(new_prefix).
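To make the proposed change concrete, here is a minimal sketch of what the None check could look like. I haven't reproduced the actual flatten_path_dict from config_utils.py here: the signature, the `sep` parameter, and the handling of lists and scalars are all assumptions for illustration. Only the names val, output, and new_prefix come from the description above.

```python
def flatten_path_dict(path_dict, prefix=None, sep="/"):
    """Hypothetical sketch, NOT the real config_utils.py implementation.

    Flattens a nested dict into a list of joined paths. A key whose value
    is None (as produced by the value-less `? key` YAML entries) is treated
    as a leaf.
    """
    output = []
    for key, val in path_dict.items():
        new_prefix = key if prefix is None else f"{prefix}{sep}{key}"
        if val is None:
            # Proposed change: a value-less key is itself the file name.
            output.append(new_prefix)
        elif isinstance(val, dict):
            output.extend(flatten_path_dict(val, new_prefix, sep))
        elif isinstance(val, (list, tuple)):
            # Assumed existing behavior: plain arrays of file names.
            output.extend(f"{new_prefix}{sep}{item}" for item in val)
        else:
            output.append(f"{new_prefix}{sep}{val}")
    return output


# The 'ohsu' entry from the parsed config above
ohsu = {'pasat': {'E-01099-M-7-2017-09-19.csv': None,
                  'E-01318-M-2-2017-08-28.csv': None,
                  'OHSU-specific-file.csv': None}}
ohsu_paths = flatten_path_dict(ohsu)
```

With this, ohsu_paths holds the two shared files plus the OHSU-specific one, each prefixed with pasat/, as one flat list. Configs that still use plain arrays would keep working under the assumed list branch.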
