Current state of the OpenNeuro dataset #35
FTR: With respect to re-converting |
Our local copy of the OpenNeuro dataset lives here:
Based on the disk usage inspection, I assume that the download was successful:
To check whether the "S3 bucket error" resulted in any content missing, I have run |
The goal now is to compare the current state of the OpenNeuro dataset against the two datasets that we are primarily interested in getting back into shape (i.e. the so-called "phase1" and "phase2" datasets).
The location of the OpenNeuro data:
The location of the "phase1" data:
The location of the "phase2" data:
We want to generate lists of what's common and lists of what's unique:
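Lists of this kind could be produced roughly as in the following sketch. It is not the script used in this thread: it assumes each dataset is a plain directory tree, compares relative file paths only (not file content), and the names `list_files` and `compare_trees` are made up for illustration.

```python
from pathlib import Path


def list_files(root):
    """Return the set of file paths below `root`, relative to `root`."""
    root = Path(root)
    found = set()
    for p in root.rglob("*"):
        rel = p.relative_to(root)
        # Assumption: anything below .git is repository internals, not data.
        if ".git" in rel.parts:
            continue
        # Count regular files and symlinks (annexed files are often
        # symlinks, possibly dangling); skip real directories.
        if p.is_file() or p.is_symlink():
            found.add(rel.as_posix())
    return found


def compare_trees(a, b):
    """Split the relative paths of two trees into common and unique ones."""
    files_a, files_b = list_files(a), list_files(b)
    return {
        "common": sorted(files_a & files_b),
        "only_a": sorted(files_a - files_b),
        "only_b": sorted(files_b - files_a),
    }
```

For example, `compare_trees("/data/project/studyforrest/openneuro", phase1_dir)` would compare the local OpenNeuro copy against the "phase1" data, where `phase1_dir` is a placeholder for its location. Because only path names are compared, files that share a name but differ in content still end up under `common`.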
My approach to the problem would be to compare the
That would result in a list of the following structure:
|
Looks sane to me :) And I'm very much interested in this:
for comparing to fresh conversion. |
I noticed something confusing to me:
This is the only sidecar file I could find, referring |
The final approach (proposed by @mih) to compare the datasets in a more human-readable fashion is the following.
To run the script:
|
There are some files missing in the OpenNeuro dataset. Subject 5 has only 2 (instead of 8) physio files for the auditoryperception/pandora task:
There are no files for subject 7:
There are no files for subject 18
and subject 19
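A check like this could be automated along the lines of the sketch below. It rests on assumptions that are not confirmed in this thread: that subjects live in `sub-*` directories, that physio files end in `_physio.tsv.gz` and carry a `task-<label>` entity in their name (BIDS-style naming), and that 8 files per subject is the expected count. The function names are hypothetical.

```python
from pathlib import Path


def count_physio(root, task="auditoryperception"):
    """Count physio files per subject directory for one task."""
    root = Path(root)
    counts = {}
    for sub_dir in sorted(root.glob("sub-*")):
        if not sub_dir.is_dir():
            continue
        # Start at 0 so that subjects without any physio file are reported.
        # A subject whose directory is missing entirely will not show up.
        counts[sub_dir.name] = 0
        for p in sub_dir.rglob("*_physio.tsv.gz"):
            # Assumed naming: the task label appears as "task-<label>".
            if f"task-{task}" in p.name:
                counts[sub_dir.name] += 1
    return counts


def incomplete(counts, expected=8):
    """Return the subjects whose file count differs from `expected`."""
    return {sub: n for sub, n in counts.items() if n != expected}
```

Running `incomplete(count_physio(dataset_root))` on the OpenNeuro copy and on our own copy, with `dataset_root` pointing at the respective dataset, would give two short dictionaries that can be compared by eye.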
|
The pandora/auditoryperception session on OpenNeuro has wrong stimulus file names and does not ship the audio stimulus files. |
The events.tsv files for the pandora session on OpenNeuro also have the run and run_id association messed up. From OpenNeuro:
From us:
|
@bpoldrack, I'm not sure if this is going to be useful in any way, but you can find some very simple scripts that use
Obviously, you need to know what to compare against what. If you have any thoughts on what could be improved for this to be useful, just let me know (pinging @mih). |
Just FTR: with regard to the presumed
I have run
Now, the FSL header information has been obtained with
The conclusion is that
|
Cool, that is good to know. Hence there is no point in trying to implement something like this in the conversion. Also, the response to my original issues was along the lines of "unclear why". I'd say we stick with the output of the more modern converter that @bpoldrack is using. |
This issue is meant to record all aspects related to the current state of the dataset available at OpenNeuro:
https://openneuro.org/datasets/ds000113/versions/1.3.0
The dataset has been obtained with:
The local copy of the dataset is stored at:
/data/project/studyforrest/openneuro