BCDC Upload Detailed Example
This is a detailed example of what needs to be done for an initial ingest of transcriptomics data by the U01/U19 user (the User), the R24 archive (the Archive), and the BCDC metadata repository (BCDC).
When the User collects their data, the User will generate files that will need to be uploaded to the Archive.
To upload data to BCDC, the User must be set up with accounts and reference objects. For each grant, a namespace will be created to contain the metadata associated with the experimental data. The User should work with a BCDC representative to set up these reference objects. The reference objects should represent entities in the ontology, including (not a complete list):
- grants
- organizations
- protocols
- species
- sex
- techniques
- archives
These reference objects must be created before they can be used when metadata is uploaded to BCDC.
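The requirement above can be checked locally before any upload is attempted. The following is a minimal sketch, not a BCDC tool: the record layout (an `id` plus a `references` mapping from reference type to identifier) and the function name `find_missing_references` are assumptions made for illustration, since the actual metadata format is defined by the ontology documentation.

```python
from typing import Dict, List, Set, Tuple


def find_missing_references(
    records: List[dict],
    registered: Dict[str, Set[str]],
) -> List[Tuple[str, str, str]]:
    """Return (record id, reference type, reference id) for every reference
    that has not yet been created as a reference object.

    `registered` maps a reference type (for example "species") to the set of
    identifiers already set up for the User. Both layouts are hypothetical.
    """
    missing = []
    for record in records:
        for ref_type, ref_id in record.get("references", {}).items():
            if ref_id not in registered.get(ref_type, set()):
                missing.append((record["id"], ref_type, ref_id))
    return missing


if __name__ == "__main__":
    registered = {"species": {"mouse"}, "techniques": {"technique-a"}}
    records = [
        {"id": "specimen-1", "references": {"species": "mouse"}},
        {"id": "specimen-2", "references": {"species": "rat"}},
    ]
    for record_id, ref_type, ref_id in find_missing_references(records, registered):
        print(f"{record_id}: {ref_type} '{ref_id}' has no reference object yet")
```

An empty result means every reference used by the metadata has a matching reference object under these assumptions.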
The User must associate metadata with each experimental file. The metadata that the User creates should represent entities in the ontology, including (not a complete list):
- files
- feature sets
- features
- processes
- observations
- specimens
- genome alignments
- cell phenotypes
- projects
The User should refer to the ontology documentation to learn how to model the information about the experiment as these ontological entities.
In order to confirm that the Archive can receive all experimental files correctly, a file manifest must be created that lists all experimental files with validation information. BCDC may create a script that can be used to create this file manifest from a directory of files.
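As an illustration of what such a script might look like, here is a minimal sketch that walks a directory and records one row per file. The manifest columns (`file_name`, `size_bytes`, `sha256`), the choice of SHA-256 as the validation information, and the tab-separated output are all assumptions; the actual manifest format is defined by BCDC and the Archive.

```python
import csv
import hashlib
import io
from pathlib import Path
from typing import Dict, List

# Hypothetical column names; the real manifest format may differ.
MANIFEST_FIELDS = ["file_name", "size_bytes", "sha256"]


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large data files are never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> List[Dict[str, object]]:
    """Return one row per regular file under `root`, sorted by path."""
    rows = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rows.append({
            "file_name": path.relative_to(root).as_posix(),
            "size_bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return rows


def manifest_to_tsv(rows: List[Dict[str, object]]) -> str:
    """Serialize manifest rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=MANIFEST_FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Prints a manifest for the current directory.
    print(manifest_to_tsv(build_manifest(Path("."))), end="")
```

With a size and checksum per file, the Archive can recompute both after transfer and compare them against the manifest to detect missing or corrupted files.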
Here is more information about the File Manifest.
Each Archive will provide instructions on how to upload experimental data to its data store. Along with the experimental data, the User will also upload a file manifest to the Archive.
BCDC will provide an API to allow the User to upload their metadata to BCDC. The User will upload metadata to BCDC in this order (to preserve referential integrity):
- cell phenotypes
- specimens
- processes
- observations
- feature sets
- features
- genome alignments
- projects
- files
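The ordering above can be enforced on the client side before anything is sent. The sketch below sorts a mixed batch of metadata records into the listed order; the `entity_type` field and the function name `order_for_upload` are hypothetical, and the actual upload call is left out because the BCDC API is not described here.

```python
from typing import Dict, List

# Entity types in the upload order given above.
UPLOAD_ORDER = [
    "cell phenotypes",
    "specimens",
    "processes",
    "observations",
    "feature sets",
    "features",
    "genome alignments",
    "projects",
    "files",
]


def order_for_upload(records: List[Dict]) -> List[Dict]:
    """Sort records so each entity type is uploaded in the required order.

    Assumes every record carries an "entity_type" key (a hypothetical field).
    The sort is stable, so records of the same type keep their input order.
    """
    rank = {name: position for position, name in enumerate(UPLOAD_ORDER)}
    unknown = sorted({r["entity_type"] for r in records} - set(rank))
    if unknown:
        raise ValueError(f"no upload position defined for: {unknown}")
    return sorted(records, key=lambda r: rank[r["entity_type"]])


if __name__ == "__main__":
    batch = [
        {"entity_type": "files", "id": "file-1"},
        {"entity_type": "specimens", "id": "specimen-1"},
        {"entity_type": "cell phenotypes", "id": "phenotype-1"},
    ]
    for record in order_for_upload(batch):
        print(record["entity_type"], record["id"])
```

Rejecting unknown entity types up front avoids silently uploading a record at a position where the objects it refers to may not exist yet.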
BCDC will provide an API for the Archive to retrieve the metadata for a file. A bulk retrieval interface may also be implemented.
The Archive will provide an API to retrieve the URL for each file. Other file-specific metadata present in the file manifest may also be included in this retrieval. BCDC may call this endpoint in order to provide a download link in a web interface.
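To make the shape of this lookup concrete, here is a minimal in-memory sketch of what the Archive side might return for a file. Everything in it is hypothetical: the class names, the `size_bytes` and `sha256` manifest fields, and the `https://archive.example.org` base URL are placeholders, and a real Archive would expose this over its own API rather than a Python object.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FileLocation:
    """What a lookup returns: the download URL plus manifest details."""
    file_name: str
    url: str
    size_bytes: int
    sha256: str


class ArchiveIndex:
    """In-memory stand-in for an Archive's file lookup (illustrative only)."""

    def __init__(self, base_url: str) -> None:
        self._base_url = base_url.rstrip("/")
        self._entries: Dict[str, FileLocation] = {}

    def register(self, file_name: str, size_bytes: int, sha256: str) -> None:
        """Record a received file together with its manifest details."""
        url = f"{self._base_url}/{file_name}"
        self._entries[file_name] = FileLocation(file_name, url, size_bytes, sha256)

    def lookup(self, file_name: str) -> Optional[FileLocation]:
        """Return the location for a file, or None if it is not in the Archive."""
        return self._entries.get(file_name)


if __name__ == "__main__":
    index = ArchiveIndex("https://archive.example.org/files/")
    index.register("sample-1.fastq.gz", size_bytes=1024, sha256="0" * 64)
    location = index.lookup("sample-1.fastq.gz")
    if location is not None:
        print(location.url)
```

Returning `None` for an unknown file lets the caller decide whether to hide the download link or report that the file has not been received.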