Establish/document DataLad workflow for updating datasets on rolando with curation changes #17

Open
4 tasks
yarikoptic opened this issue May 7, 2021 · 0 comments

yarikoptic commented May 7, 2021

BIDS datasets organized by HeudiConv/ReproIn are not curated. Ideally, labs should fetch them using DataLad as described in the instructions, then curate and enhance them. As long as DataLad (or git/git-annex directly) is used, we can establish a clean and unambiguous workflow to propagate those changes back to rolando's centralized collection of datasets. Curation entails:

  • removing or renaming __dups, with corresponding changes to IntendedFor in fmap .json files (under git), if populated, and to _scans.tsv files (under annex)
  • addressing the TODOs documented throughout the dataset
  • populating _events.tsv files
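
As a starting point for the first item, here is a minimal sketch that lists candidate duplicates and the fmap sidecars whose IntendedFor still points at them. It assumes duplicates carry a `__dup` marker in their file name and that sidecars live in `fmap/` directories. The function names are hypothetical, and checking `_scans.tsv` is not covered.

```python
import json
from pathlib import Path

# Assumption: duplicated acquisitions are marked with "__dup" in the file name.
DUP_MARKER = "__dup"


def find_dups(dataset_root):
    """Return dataset-relative paths of files whose name contains DUP_MARKER."""
    root = Path(dataset_root)
    found = []
    for p in root.rglob(f"*{DUP_MARKER}*"):
        rel = p.relative_to(root)
        if ".git" in rel.parts:
            continue  # skip git/git-annex internals
        # Annexed files are symlinks, possibly without content present locally.
        if p.is_file() or p.is_symlink():
            found.append(rel.as_posix())
    return sorted(found)


def intended_for_dups(dataset_root):
    """Map each fmap .json sidecar to its IntendedFor entries that mention DUP_MARKER."""
    root = Path(dataset_root)
    hits = {}
    for sidecar in sorted(root.glob("**/fmap/*.json")):
        meta = json.loads(sidecar.read_text())
        intended = meta.get("IntendedFor", [])
        if isinstance(intended, str):  # IntendedFor may be a single string
            intended = [intended]
        stale = [target for target in intended if DUP_MARKER in target]
        if stale:
            hits[sidecar.relative_to(root).as_posix()] = stale
    return hits
```

Running both functions on a dataset clone before and after curation gives a quick check that no `__dup` file or reference was left behind.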

To make it work we need to

  • establish git-annex storage for each dataset's git repo, available to both the DBIC personnel curating BIDS datasets (well, me ATM) and researchers. We could
    - use GitHub with private repos, automatically creating and populating the repositories here, with LFS storing the annexed _scans.tsv files. That would also give us an issue tracker etc. But even though the repos would be private, I am not certain this is "kosher". Also, mapping rolando users to GitHub accounts might prove to be a pain.
    - "mirror" the entire hierarchy on rolando, giving researchers write access to push changes. E.g. it could be /inbox/BIDS-curated or alike (e.g. just a nearby bare git-annex repo with the needed permissions for each dataset, with a -curated suffix in the name). This is the simplest way ATM and could be done on a case-by-case basis. Cons: no issue tracker. Pros: simple/easy.
    - have a DBIC GitLab instance provided somewhere internally. Pros: very feature-rich, supports hierarchical organization. Cons: no git-annex support, so we would still need the "mirror".
    - have a DBIC gin (https://gin.g-node.org/) instance. Pros: has all the needed features of GitHub and supports git-annex.
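
To make the "mirror" option more concrete, here is a rough sketch of the commands it could involve, built as argument lists so they can be reviewed before anything is run. The `/inbox/BIDS-curated` root and the `-curated` suffix come from the option above. The helper names, the `curated` sibling name, the `lab/study` path, and the use of `--shared=group` for permissions are assumptions, not an agreed setup.

```python
import shlex
from pathlib import PurePosixPath

# Example location mentioned above; the real one is still to be decided.
MIRROR_ROOT = PurePosixPath("/inbox/BIDS-curated")


def curated_mirror_path(dataset_relpath, mirror_root=MIRROR_ROOT):
    """Mirror the dataset's place in the hierarchy, adding a -curated suffix."""
    rel = PurePosixPath(dataset_relpath)
    return mirror_root / rel.parent / f"{rel.name}-curated"


def admin_commands(mirror):
    """Commands to create a bare, group-writable git-annex repo (run once per dataset)."""
    m = str(mirror)
    return [
        ["git", "init", "--bare", "--shared=group", m],
        ["git", "-C", m, "annex", "init", "curated mirror"],
    ]


def researcher_commands(mirror, sibling="curated"):
    """Commands to run inside the curated dataset clone to push changes back."""
    return [
        ["datalad", "siblings", "add", "-s", sibling, "--url", str(mirror)],
        ["datalad", "push", "--to", sibling],
    ]


if __name__ == "__main__":
    # "lab/study" is a made-up dataset path, used for illustration only.
    mirror = curated_mirror_path("lab/study")
    for cmd in admin_commands(mirror) + researcher_commands(mirror):
        print(shlex.join(cmd))
```

Printing the commands instead of executing them keeps the sketch safe to try; actual group ownership and ACLs on rolando would still need to be set up separately.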

@andycon WDYT about a GitLab or gin instance for DBIC? In both cases I am not sure how easy it would be to integrate with the user accounts/permissions already out there. Needs "research".
