Support no-prefix metadata records (one record per directory) #50
Comments
I think that is a useful structure.
Does this imply serializing a dataset tree (git repos with submodules) into a collection of directories that are structured as described above? IIUC, that would require the possibility of a submodule entry in a tabby record. Or did I misunderstand the intention here?
Yes, your conclusion is correct.
I implemented a no-prefix tabby file collection, i.e. one where there would be a plain I found this to be cumbersome and full of corner cases. For example, conversion to XLSX would require inventing a file name (because the component names would vanish, and without a prefix there would be nothing left, apart from I conclude that making this "simplification" actually leads to a complication of code and, worse, of 3rd-party handling code. This realization does not impact the general organization of
except that any file in
In preparation for #79, and despite the conclusion in #50, this change adds support for a simplified set of files that form a tabby record. The only simplification is that the common prefix is removed from all filenames. The demo record is now also included in this format.

This layout is what we would like to put into a ZIP file container. The prefix continues to exist (this was the main concern in #50), but is now the name of the parent directory.

In #55 this simplifies the setup for the self-description of a dataset. All files could go into `.datalad/tabby/self/` and have short names like:

- `dataset.tsv`
- `dataset.override.json`
- ...

No additional markup is necessary to distinguish the single-item-dir format from the prefixed layout. The absence of an underscore character is evidence enough.

Closes #50 (for real)
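
To illustrate the underscore rule described above, here is a minimal sketch. It is not the actual implementation in this change: the helper name `split_tabby_name` is hypothetical, and it assumes that component names (such as `dataset` or `authors`) never contain an underscore, so a prefixed name can be split at its last underscore.

```python
from pathlib import Path


def split_tabby_name(path):
    """Return (prefix, component) for a tabby file path.

    Hypothetical helper for illustration only.
    Assumption: component names contain no underscore.
    """
    p = Path(path)
    # Drop all extensions: "dataset.override.json" -> "dataset"
    stem = p.name.split(".", 1)[0]
    if "_" in stem:
        # Prefixed layout: "<prefix>_<component>.<ext>"
        prefix, component = stem.rsplit("_", 1)
    else:
        # Single-item-dir layout: the parent directory name is the prefix
        prefix, component = p.parent.name, stem
    return prefix, component
```

With this sketch, `records/demo_authors.tsv` would be read as prefix `demo` and component `authors`, while `.datalad/tabby/self/dataset.override.json` would be read as prefix `self` and component `dataset`.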
#48 made me think about multi-record organizations. I think it would be useful to support schemes like
<root> / <dataset-id> / <dataset-version> / <tabby-collection>
This would make a common prefix unnecessary. A record would have files like
dataset.tsv
authors.tsv
authors.override.json
and all files of a record (and only those) would be contained in a (versioned) directory.
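
As a rough sketch of how such a layout could be consumed, the following walks a `<root>/<dataset-id>/<dataset-version>/` tree and lists the files of each record. The function name `iter_records` is hypothetical and not part of this project; it assumes every directory exactly two levels below the root is one record.

```python
from pathlib import Path


def iter_records(root):
    """Yield (dataset_id, dataset_version, filenames) per record directory.

    Hypothetical helper for illustration only.
    Assumption: each directory two levels below `root` holds exactly one record.
    """
    root = Path(root)
    for version_dir in sorted(root.glob("*/*")):
        if not version_dir.is_dir():
            continue
        # All files in the versioned directory belong to the record
        files = sorted(f.name for f in version_dir.iterdir() if f.is_file())
        yield version_dir.parent.name, version_dir.name, files
```

Because the directory itself identifies the record, no filename parsing is needed to group files such as `dataset.tsv` and `authors.tsv` together.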