Reconversion of phase1 data from raw into bids #29
Comments
I think it depends on what's most important. I can see the argument for the first approach. However, I'd argue that there is an additional pro for the second approach: we'd have a raw dataset plus specifications, to which we can much more easily apply any different conversion in the future. Think of significant changes in BIDS, or yet another standard we'd want to represent the data in. It may also be easier to adopt new metadata standards/formats in the future. So, I'm not exactly sure. I need to have a closer look into the existing repo to see what we may lose that way. One more thing: if we actually can have time-traveling containers that reproduce what was done back then, it seems to me that we have a third approach: a merger of both. Nothing forces us to use hirni with the current toolbox. However, if we go for 1) or 3), I'll need help figuring out what was done and how, and therefore how to build the container(s), and maybe break things down into a few procedures. Inspecting that on my own sounds like it'll take too long.
I'd say we go for (2) hirni in this case.
@bpoldrack here is the BIDS spec for the diffusion data: https://bids-specification.readthedocs.io/en/stable/04-modality-specific-files/01-magnetic-resonance-imaging-data.html#diffusion-imaging-data
Susceptibility Weighted Imaging (SWI) is still a BEP (https://docs.google.com/document/d/1kyw9mGgacNqeMbp4xZet3RnDhcMmf4_BmRgKaOkO2Sc/edit).
Referencing #34 (comment)
Note: Instead of …, comparison of the conversion outcome is to be made against …
Starting to look into conversion issues. Task labels (see #35 (comment)):
What labels do we settle for, @mih? Edit:
Here is a potential structure for the events files of the pandora data. They should be derived from the
The corresponding JSON sidecar file should look something like this:

```json
{
  "trial_type": {
    "LongName": "Event category",
    "Description": "Indicator of the genre of the musical stimulus",
    "Levels": {
      "country": "Country music",
      "symphonic": "Symphonic music",
      "metal": "Metal music",
      "ambient": "Ambient music",
      "rocknroll": "Rock'n'roll music"
    }
  },
  "sound_soa": {
    "LongName": "Sound onset asynchrony",
    "Description": "Asynchrony between MRI trigger and sound onset"
  },
  "catch": {
    "LongName": "Control question",
    "Description": "Flag whether a control question was presented"
  },
  "volume": {
    "LongName": "fMRI volume total",
    "Description": "fMRI volume corresponding to stimulation start"
  },
  "run_volume": {
    "LongName": "fMRI volume run",
    "Description": "fMRI volume corresponding to stimulation start in the current run"
  },
  "run": {
    "LongName": "Run in sequence",
    "Description": "Order of run in sequence"
  },
  "run_id": {
    "LongName": "Trial ID in run",
    "Description": "ID of trial sequence for this run"
  },
  "stim": {
    "LongName": "Stimulation file",
    "Description": "Stimulus file name"
  },
  "delay": {
    "LongName": "Inter-stimulus interval",
    "Description": "Inter-stimulus interval",
    "Units": "seconds"
  },
  "trigger_ts": {
    "LongName": "Trigger time stamp",
    "Description": "Time stamp of the corresponding MRI trigger with respect to the start of the experiment",
    "Units": "seconds"
  },
  "genre": {
    "LongName": "Genre",
    "Description": "Indicator of the genre of the musical stimulus",
    "Levels": {
      "country": "Country music",
      "symphonic": "Symphonic music",
      "metal": "Metal music",
      "ambient": "Ambient music",
      "rocknroll": "Rock'n'roll music"
    }
  }
}
```
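A quick consistency check for the generated files could assert that each events.tsv header matches this sidecar. A minimal shunit2-style sketch, assuming the columns above plus the mandatory BIDS `onset` and `duration` columns, and a hypothetical file name pattern:

```sh
# every pandora events.tsv should carry exactly the expected column set
test_events_header()
{
    expected="$(printf 'onset\tduration\ttrial_type\tsound_soa\tcatch\tvolume\trun_volume\trun\trun_id\tstim\tdelay\ttrigger_ts\tgenre')"
    for f in sub-*/func/*_task-pandora_*_events.tsv; do
        assertEquals "Unexpected columns in $f" "$expected" "$(head -n 1 "$f")"
    done
}
```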
For anatomy I have the following image series (SeriesNumber, Protocol, currently assigned modality, whether currently converted or ignored for conversion): (101, 'SmartBrain_32channel', None, 'ignored'), … Questions: Is there something that is ignored but should be converted? Which ones should be assigned the modalities veno and angio?
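For cross-checking such listings, the (SeriesNumber, ProtocolName) pairs can be re-read straight from the DICOMs; a sketch assuming dcmtk's dcmdump is available and that one representative file per series sits under a hypothetical dicoms/ layout:

```sh
# dump series number and protocol name for one representative DICOM per series
for f in dicoms/*/00000001.dcm; do
    echo "== $f"
    dcmdump +P SeriesNumber +P ProtocolName "$f"
done
```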
Suggested content for the
TODOs:
Note: This is different from the OpenNeuro |
Current image series for 7T_ad: (1, 'AAHScout_32ch', None, 'ignored'), (2, 'AAHScout_32ch_MPR', None, 'ignored'), …
And pandora, @mih: (2, 'AAHScout_32ch_MPR', None, 'ignored'), …
I found a trigger mismatch in the converted physio files from the audiomovie in your test repo (/data/project/studyforrest_phase1/testing/scientific-data-2014-bids/sub-002/func), @bpoldrack. The test checks whether it finds the same number of triggers as the run had TRs:

```sh
# count the trigger events (lines starting with '1') in each physio file
# and compare against the expected number of volumes for that run
check_physio()
{
    nvols=$1
    shift
    for f in "$@"; do
        found_trigger="$(zgrep -c '^1' "$f")"
        assertEquals "Need to find each trigger in the log" "$nvols" "$found_trigger"
    done
}

# expected volume counts for the eight audiomovie runs
test_physio_movie_runs()
{
    count=1
    for nvols in 451 441 438 488 462 439 542 338; do
        check_physio $nvols *_task-forrestgump_run-0${count}_physio.tsv.gz
        count=$(( count + 1 ))
    done
}
```
It fails for a few subjects in the conversion. Subject 1:
Subject 2:
Subjects 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, and 17 are good! Subject 18 (only fails by one trigger):
Subjects 19 and 20 are good. It does not fail for the OpenNeuro dataset.
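For reference, a one-liner to survey the trigger counts across all subjects and runs at once (a sketch; the glob assumes the converted dataset layout):

```sh
# print the trigger count per physio file for every subject and run
for f in sub-*/func/*_task-forrestgump_run-*_physio.tsv.gz; do
    printf '%s\t%s\n' "$f" "$(zgrep -c '^1' "$f")"
done
```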
Reminder to add the scanner acquisition protocols to |
Doing the same assertion for the pandora data yields mismatches:
Subject 18:
Here is the BIDS convention for the
Re physio trigger mismatches: noting observations for now. At least for the audiomovie, the failing subjects are exactly the ones with a sampling frequency of 100, while the passing ones have 200. Didn't check pandora yet.
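To correlate this with the failures, the sampling frequency can be pulled from each physio sidecar; a sketch, assuming the sidecars carry the standard BIDS SamplingFrequency field and that jq is available:

```sh
# list the sampling frequency recorded in each physio sidecar
for f in sub-*/func/*_task-forrestgump_*_physio.json; do
    printf '%s\t%s\n' "$f" "$(jq '.SamplingFrequency' "$f")"
done
```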
Scripts for converting the log files into events.tsv and events.json files are in …

```sh
# for the trial file
./reconvert_behavtrials_pandora \
    /data/project/studyforrest/pandora/logs/xy22.trials \
    'sub-01_task-avmovie_run-0_events'

# for the log files
./reconvert_behavlog_pandora \
    /data/project/studyforrest/pandora/logs/ap75.log \
    'sub-01_task-avmovie_run-0_events'
```

i.e. …
I have a README put together, but it requires updated file paths. Once we have a sample single-subject converted dataset, I can start updating them.
Update from the Matrix channel: First of all, it ran through, yay! I have not yet looked closely into it. An initial glimpse says there are some (somewhat minor) issues:
Those are all "technical" issues that should be relatively easy to fix, and the fixes easy to apply. Here it is: /data/project/studyforrest_phase1/phase1-bids. ping @mih
It would be good if this issue got a checklist to capture what has been looked at, and for which conversion attempt. Otherwise it will be rather hard to come to an end.
Edited first post. |
Here are the issues I've gathered so far for the recent conversion. Re missing data types: these are files that were described in the old README that I haven't managed to find in the newly converted dataset. Some of them likely live elsewhere and don't belong in this dataset, but I wanted to list them just in case.
Re BIDS compliance:
Thanks, @loj!
We decided to not include them. Still the case, @mih?
== dico
Thx, need to investigate. Not intended.
True. Simply forgot to convert the toplevel specs of the raw datasets ;-)
@mih: Does that stuff exist in any other place and/or shape other than …
Same here, @mih - no idea where that even comes from.
Yes, there's something wrong with the created name.
We decided to go with
Yes, but I have no clue.
Good catch. That's a hint that I seem to have screwed up the versions of the raw dataset. That was an (already fixed) bug, which explains the toplevel stimuli, too.
Will do.
Ah - need to fix the deface procedure then.
Hard to say; this issue lumps together so many aspects. If these are the fieldmaps that were acquired together with the DWI data at 3T, then yes, they are invalid.
moco = motion corrected; dico = distortion corrected. So depending on which specific data we are talking about, that statement is either true (moco is a precondition for dico) or not (dico is optional).
As in "not intended to be there for now"?
demographics.csv is an original file that contains data only available on paper. The two other CSVs are outdated; their old versions are here: https://github.com/psychoinformatics-de/studyforrest-data-annotations/blob/master/old/structure/scenes.csv and https://github.com/psychoinformatics-de/studyforrest-data-annotations/blob/master/old/speech/german_audio_description.csv
This is described in the data paper under Technical Validation. These can all be ignored for the raw dataset, because they are the outcome of a computational pipeline (i.e., derivatives). Everything labeled "alignment/volumes" in the list above is in https://github.com/psychoinformatics-de/studyforrest-data-templatetransforms. The aggregate timeseries are in https://github.com/psychoinformatics-de/studyforrest-data-aggregate. From my POV, they can stay there (hosted on GIN).
Either way is fine with me: one is an established non-standard, the other is the anticipation of a standard.
As mentioned elsewhere, the task descriptions are in … I don't think anyone has looked up the IDs for these tasks on http://www.cognitiveatlas.org/ yet.
Thx, looks OK to me.
There are two possible approaches that I can see:
Make the code from 2013 run reproducibly
This should be doable; this was all standard Debian packages plus the custom code in the datasets. We could create a Singularity image that travels back in time. This would be attractive from a forensic data management perspective, and might get us to a state that matches the OpenNeuro one (#28), but with provenance.
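A minimal sketch of such a time-traveling container, assuming a Debian release contemporary with the original conversion and using snapshot.debian.org to pin the package state (the snapshot date and package names are placeholders):

```
Bootstrap: docker
From: debian/eol:wheezy

%post
    # point APT at the snapshot archive for a fixed date (placeholder date)
    echo 'deb http://snapshot.debian.org/archive/debian/20131201T000000Z/ wheezy main' > /etc/apt/sources.list
    # the snapshot serves archived metadata; relax the validity checks
    apt-get -o Acquire::Check-Valid-Until=false update
    apt-get install -y --force-yes python-numpy python-nibabel
```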
Redo the conversion with modern-day tooling
This comes with the danger that files come out differently. One would then need to figure out how they differ, and maybe even why. Pro: this will give us much better metadata automatically, and we can showcase hirni. Con: lots of work, possibly leading to lots more work.
Waddayathink, @bpoldrack?
Checklist for the BIDS dataset:
Version:
01fe519fb76c92dd323c4876b57554ce010928f0