multiplex cite-seq added to data model. #410

Merged
merged 7 commits into from
Jun 6, 2024
Changes from 4 commits
9 changes: 9 additions & 0 deletions HTAN.model.csv
@@ -66,6 +66,10 @@ scmC-seq Level 2,"Files contain scmC-seq files containing aligned sequence data,
scATAC-seq Level 4,"Data represents the relationships between cells derived from Level 3 expression data and shown as tSNE or UMAP coordinates per cell, plus all other cell-specific meta information (e.g., cell type)",,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, scATACseq Workflow Type, scATACseq Workflow Parameters Description, Workflow Version, Workflow Link",,FALSE,Sequencing,scATAC-seq Level 3,,
scDNA-seq Level 1,Single-cell DNA-seq,,"Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Sequencing Batch ID, Library Layout, Nucleic Acid Source, Library Selection Method, Read Length, Library Preparation Kit Name, Library Preparation Kit Vendor, Library Preparation Kit Version, Adapter Name, Adapter Sequence, Base Caller Name, Base Caller Version, Flow Cell Barcode, Fragment Maximum Length, Fragment Mean Length, Fragment Minimum Length, Fragment Standard Deviation Length, Lane Number, Library Strand, Multiplex Barcode, Size Selection Range, Target Depth, To Trim Adapter Sequence, Adapter Content, Basic Statistics, Encoding, Kmer Content, Overrepresented Sequences, Per Base N Content, Per Base Sequence Content, Per Base Sequence Quality, Per Sequence GC Content, Per Sequence Quality Score, Per Tile Sequence Quality, Percent GC Content, Sequence Duplication Levels, Sequence Length Distribution, Total Reads, QC Workflow Type, QC Workflow Version, QC Workflow Link",,FALSE,Sequencing,Biospecimen,,
scDNA-seq Level 2,Alignment workflows downstream of scDNA-seq Level 1,,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Alignment Workflow Url, Alignment Workflow Type, Genomic Reference, Genomic Reference URL, Index File Name, Average Base Quality, Average Insert Size, Average Read Length, Mean Coverage, Pairs On Diff CHR, Total Reads, Proportion Reads Mapped, MapQ30, Total Uniquely Mapped, Total Unmapped reads,Proportion Reads Duplicated, Short Reads, Proportion Coverage 10x, Proportion Coverage 30X, Proportion Targets No Coverage, Proportion Base Mismatch, Proportion Mitochondrial Reads, Contamination, Contamination Error",,FALSE,Sequencing,scDNA-seq Level 1,,
Multiplexed CITE-seq Level 1,Raw sequencing files for the multiplexed CITE-seq assay,,"Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Nucleic Acid Source, Cryopreserved Cells in Sample, Single Cell Isolation Method, Dissociation Method, Library Construction Method, Read Indicator, Read1, Read2, cDNA, End Bias, Reverse Transcription Primer, Spike In, Spike In Concentration, Sequencing Platform, Total Number of Input Cells, Input Cells and Nuclei, Library Preparation Days from Index, Single Cell Dissociation Days from Index, Sequencing Library Construction Days from Index, Nucleic Acid Capture Days from Index, Protocol Link, Technical Replicate Group, Empty Well Barcode, Well Index, Feature Reference Id, Associated mRNA Library Data File ID, Single Cell Barcode Method Applied, Feature Barcode Library Type, Barcode Folder Synapse ID, Barcode Folder File List",,FALSE,Sequencing,Biospecimen,,
Multiplexed CITE-seq Level 2,Alignment workflows downstream of Multiplexed CITE-seq Level 1,,"Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Data File ID, Associated mRNA Library Data File ID, scRNAseq Workflow Type, Workflow Version, scRNAseq Workflow Parameters Description, Workflow Link, Genomic Reference, Genomic Reference URL, Genome Annotation URL, Checksum, Whitelist Cell Barcode File Link, Cell Barcode Tag, UMI Tag, Applied Hard Trimming",,FALSE,Sequencing,Multiplexed CITE-seq Level 1,,
Multiplexed CITE-seq Level 3,Gene and Isoform expression files,,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Parent Biospecimen ID, HTAN Data File ID, Associated mRNA Library Data File ID, Data Category, Matrix Type, Linked Matrices, Cell Median Number Reads, Cell Median Number Genes, Cell Total, scRNAseq Workflow Type, scRNAseq Workflow Parameters Description, Workflow Link, Workflow Version",,FALSE,Sequencing,Multiplexed CITE-seq Level 2,,
Multiplexed CITE-seq Level 4,"Data represents the relationships between cells derived from Level 3 expression data and shown as tSNE or UMAP coordinates per cell, plus all other cell-specific meta information (e.g., cell type)",,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Parent Biospecimen ID, HTAN Data File ID, Associated mRNA Library Data File ID, scRNAseq Workflow Type, scRNAseq Workflow Parameters Description, Workflow Version, Workflow Link",,FALSE,Sequencing,Multiplexed CITE-seq Level 3,,
Bulk Methylation-seq Level 1,"Raw data for bulk methylation sequencing, such as FASTQs and unaligned BAMs",,"Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Nucleic Acid Source, Bisulfite Conversion, Sequencing Platform, Replicate Type, Bulk Methylation Assay Type, Total DNA Input",,FALSE,Sequencing,Biospecimen,,
Bulk Methylation-seq Level 2,"Aligned primary data for bulk methylation sequencing, such as gene expression matrix files, VCFs, etc.",,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Alignment Workflow Url, Trimmer, Bulk Methylation Genomic Reference, Genomic Reference URL, Index File Name, Alignment Workflow Type, Duplicate Removal Software, Mean Coverage, Library Layout, Average Base Quality, Average Insert Size, Average Read Length, Contamination, Contamination Error, Pairs On Diff CHR, Total Reads, Total Uniquely Mapped, Total Unmapped reads, Proportion Reads Duplicated, Proportion Reads Mapped, Proportion Targets No Coverage, Proportion Base Mismatch, Short Reads, Proportion of Minimum CpG Coverage 10X, Proportion Coverage 30X",,FALSE,Sequencing,"Bulk Methylation-seq Level 1, Biospecimen",,
Bulk Methylation-seq Level 3,"Sample level summary data for bulk methylation sequencing, such as t-SNE plot coordinates, etc.",,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID,DMC Calling Tool, DMC Calling Workflow URL, DMR Calling Tool, DMR Calling Workflow URL, pUC19 methylation ratio, Lambda methylation ratio, DMC data file format, DMR data file Format",,FALSE,Sequencing,"Bulk Methylation-seq Level 2, Biospecimen",,
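Each component row in this hunk names its upstream component in the second-to-last populated column (for example, Level 2 points at Level 1). A minimal Python sketch of a consistency check for that chain follows. It assumes the column order `Attribute, Description, Valid Values, DependsOn, Properties, Required, Parent, DependsOn Component, Source, Validation Rules`, which is not shown in the diff, and it uses abbreviated, made-up rows (`Demo Assay ...`, `Other Assay ...`) rather than the real model.

```python
import csv
import io

# Assumed column order of HTAN.model.csv (the header row is not shown in the diff).
COLUMNS = [
    "Attribute", "Description", "Valid Values", "DependsOn", "Properties",
    "Required", "Parent", "DependsOn Component", "Source", "Validation Rules",
]


def split_list(cell):
    """Split a comma-separated cell into trimmed, non-empty items."""
    return [item.strip() for item in cell.split(",") if item.strip()]


def load_rows(csv_text):
    """Parse header-less model rows into dicts keyed by the assumed columns."""
    reader = csv.reader(io.StringIO(csv_text))
    return [dict(zip(COLUMNS, row)) for row in reader if row]


def undefined_component_parents(rows):
    """Return (component, parent) pairs whose parent has no row of its own."""
    defined = {row["Attribute"] for row in rows}
    problems = []
    for row in rows:
        for parent in split_list(row["DependsOn Component"]):
            if parent not in defined:
                problems.append((row["Attribute"], parent))
    return problems


# Hypothetical, abbreviated rows; attribute lists are shortened for readability.
SAMPLE = '''Biospecimen,Biospecimen metadata,,"Component",,FALSE,Biospecimen,,,
Demo Assay Level 1,Raw files,,"Component, Filename",,FALSE,Sequencing,Biospecimen,,
Demo Assay Level 2,Aligned files,,"Component, Filename",,FALSE,Sequencing,Demo Assay Level 1,,
Demo Assay Level 3,Expression files,,"Component, Filename",,FALSE,Sequencing,Other Assay Level 2,,
'''

if __name__ == "__main__":
    for component, parent in undefined_component_parents(load_rows(SAMPLE)):
        print(f"{component}: DependsOn Component '{parent}' is not defined")
```

This only catches parents that are missing from the model. A parent that exists but belongs to a different assay would need an extra rule, such as comparing the assay-name prefix of the component and its parent.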
@@ -1030,3 +1034,8 @@ Barretts Esophagus Goblet Cells Present,Presence or absence of Barretts esophag
Pancreatitis Onset Year,Date of onset of pancreatitis.,,,,FALSE,Follow Up,,,num
HTAN Parent Channel Metadata ID,HTAN ID for a level 3 channels table.,,,,TRUE, Imaging Level 4,,,
Single Nucleus Capture,Nuclei isolation method,"Plates, 10x, droplet",,,FALSE,scmC-seq Level 1,,,
Associated mRNA Library Data File ID,HTAN Data File ID of the associated mRNA library file; follows the HTAN ID SOP (e.g. HTANx_yyy_zzz),,,,TRUE,Multiplexed CITE-seq Level 1,,,
adamjtaylor marked this conversation as resolved.
Single Cell Barcode Method Applied,The method by which cells are multiplexed or labeled with cell surface markers or probes,"Cite-seq custom panel,Biolegend Total seq A Custom,Biolegend Total seq B Custom,Biolegend Total seq C Custom,Biolegend Total seq A Human Universal cocktail V1.0,Biolegend Total seq B Human Universal cocktail V1.0,Biolegend Total seq C Human Universal cocktail V1.0,Cell Hashtag- multiplex,Nuclear Hashtag- multiplex,Other",,,TRUE,Multiplexed CITE-seq Level 1,,,
Feature Barcode Library Type,The library construction methods for the feature barcode library,"10x ADT – citeseq/hashing,10x TCR,10x BCR",,,TRUE,Multiplexed CITE-seq Level 1,,,
Barcode Folder Synapse ID,Synapse ID of the folder containing the barcode lists,,,,TRUE,Multiplexed CITE-seq Level 1,,,
Barcode Folder File List,A comma-separated list of filenames in the gzipped folder detailing which barcodes are used for demultiplexing samples versus providing surface protein data,,,,TRUE,Multiplexed CITE-seq Level 1,,,
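Two of the attributes added in this hunk restrict their values through the third column of the row. As a rough illustration of how such a row could be turned into a membership check, here is a small Python sketch. It assumes the third column is a comma-separated list of valid values; `valid_values` and `is_valid` are hypothetical helper names, not part of any HTAN or schematic API.

```python
import csv
import io


def valid_values(model_row_text):
    """Return the set of allowed values from one header-less model row."""
    row = next(csv.reader(io.StringIO(model_row_text)))
    # Assumption: the third column holds the comma-separated valid values.
    return {value.strip() for value in row[2].split(",") if value.strip()}


def is_valid(value, allowed):
    """Accept any value when no valid values are listed (free-text attribute)."""
    return not allowed or value.strip() in allowed


# The "Feature Barcode Library Type" row from the diff, copied as-is.
FEATURE_BARCODE_ROW = (
    'Feature Barcode Library Type,The library construction methods for the '
    'feature barcode library,"10x ADT – citeseq/hashing,10x TCR,10x BCR"'
    ',,,TRUE,Multiplexed CITE-seq Level 1,,,'
)

if __name__ == "__main__":
    allowed = valid_values(FEATURE_BARCODE_ROW)
    for candidate in ("10x TCR", "10x VDJ"):
        print(candidate, "->", "ok" if is_valid(candidate, allowed) else "rejected")
```

Because the list is split on commas, a single valid value cannot itself contain a comma under this reading; a stray comma inside a value would silently produce two separate entries.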