diff --git a/HTAN.model.csv b/HTAN.model.csv
index e60ec5af..4be36706 100644
--- a/HTAN.model.csv
+++ b/HTAN.model.csv
@@ -58,6 +58,8 @@ Bulk RNA-seq Level 3,Bulk RNA-seq gene expression matrices,,"Component, Filename
 Bulk WES Level 1,Bulk Whole Exome Sequencing raw files,,"Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Sequencing Batch ID, Library Layout, Read Indicator, Library Selection Method, Read Length, Target Capture Kit, Library Preparation Kit Name, Library Preparation Kit Vendor, Library Preparation Kit Version, Sequencing Platform, Adapter Name, Adapter Sequence, Base Caller Name, Base Caller Version, Flow Cell Barcode, Fragment Maximum Length, Fragment Mean Length, Fragment Minimum Length, Fragment Standard Deviation Length, Lane Number, Multiplex Barcode, Library Preparation Days from Index, Size Selection Range, Target Depth, To Trim Adapter Sequence",,FALSE,Sequencing,Biospecimen,,
 Bulk WES Level 2,Bulk Whole Exome Sequencing aligned files and QC,,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Alignment Workflow Type, Genomic Reference, Genomic Reference URL, Index File Name, Average Base Quality, Average Insert Size, Average Read Length, Contamination, Contamination Error, Mean Coverage, Adapter Content, Basic Statistics, Encoding, Overrepresented Sequences, Per Base N Content, Per Base Sequence Content, Per Base Sequence Quality, Per Sequence GC Content, Per Sequence Quality Score, Per Tile Sequence Quality, Percent GC Content, Sequence Duplication Levels, Sequence Length Distribution, QC Workflow Type, QC Workflow Version, QC Workflow Link, MSI Workflow Link, MSI Score, MSI Status, Pairs On Diff CHR, Total Reads, Total Uniquely Mapped, Total Unmapped reads, Proportion Reads Duplicated, Proportion Reads Mapped, Proportion Targets No Coverage, Proportion Base Mismatch, Short Reads, Proportion Coverage 10x, Proportion Coverage 30X,Is lowest level",,FALSE,Sequencing,Bulk WES Level 1,,
 Bulk WES Level 3,Bulk Whole Exome Sequencing called variants,,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Genomic Reference, Genomic Reference URL, Germline Variants Workflow URL, Germline Variants Workflow Type, Somatic Variants Workflow URL, Somatic Variants Workflow Type, Somatic Variants Sample Type, Structural Variant Workflow URL, Structural Variant Workflow Type",,FALSE,Sequencing,Bulk WES Level 2,,
+Microarray Level 1,Microarray Level 1 refers to the raw text table of probe level intensities,,"Component, Filename, File Format, HTAN Data File ID, HTAN Participant ID, HTAN Parent Biospecimen ID, Nucleic Acid Source, Microarray Platform ID, Microarray Molecule, Microarray Label, Microarray Value Definition, Microarray Protocol Auxiliary File",,FALSE,Assay,Biospecimen,,
+Microarray Level 2,Microarray Level 2 provides a normalized matrix of values.,,"Component, Filename, File Format, HTAN Participant ID, HTAN Parent Biospecimen ID, HTAN Parent Data File ID, HTAN Data File ID, Microarray Platform ID, Normalization Method",,FALSE,Assay,Microarray Level 1,,
 scATAC-seq Level 1,"scATAC-seq files containing sequence read information, with or without alignment, as FASTQ or BAM files",,"Component, Filename, File Format, HTAN Parent Biospecimen ID, HTAN Data File ID, Nucleic Acid Source, Dissociation Method, Single Nucleus Buffer, Single Cell Isolation Method, Transposition Reaction, scATACseq Library Layout, Nucleus Identifier, Nuclei Barcode Length, Nuclei Barcode Read, scATACseq Read1, scATACseq Read2, scATACseq Read3, Library Construction Method, Sequencing Platform, Threshold for Minimum Passing Reads, Total Number of Passing Nuclei, Median Fraction of Reads in Peaks, Median Fraction of Reads in Annotated cis DNA Elements, Median Passing Read Percentage, Median Percentage of Mitochondrial Reads per Nucleus,Technical Replicate Group, Total Reads, Protocol Link",,FALSE,Sequencing,Biospecimen,,
 scATAC-seq Level 2,"scATAC-seq files containing aligned sequence data, as a BAM file",,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, Alignment Workflow Url, Alignment Workflow Type, Genomic Reference, Genomic Reference URL, Index File Name, Average Base Quality, Average Insert Size, Average Read Length, Mean Coverage, Pairs On Diff CHR, Total Reads, Proportion Reads Mapped, MapQ30, Total Uniquely Mapped, Total Unmapped reads, Proportion Reads Duplicated, Short Reads, Proportion Coverage 10x, Proportion Coverage 30X, Proportion Targets No Coverage, Proportion Base Mismatch, Median Percentage of Mitochondrial Reads per Nucleus, Contamination,Contamination Error",,FALSE,Sequencing,scATAC-seq Level 1,,
 scATAC-seq Level 3,Processed data files containing peak information for cells,,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID, scATAC-seq Object ID, nCount Peaks, nFeature Peaks, Total Read-Pairs, Duplicate Read-Pairs, Chimeric Read-Pairs, Unmapped Read-Pairs, LowMapQ, Mitochondrial Read-Pairs, Passed Filters, TSS Fragments, DNase Sensitive Region Fragments, Enhancer Region Fragments, Promoter Region Fragments, On Target Fragments, Blacklist Region Fragments, Peak Region Fragments, Peak Region Cutsites, Nucleosome Signal, Nucleosome Percentile, TSS Enrichment, TSS Percentile, Pct Reads in Peaks, Blacklist Ratio, Seurat Clusters, nCount RNA, nFeature RNA, MACS2 Seqnames, MACS2 Start, MACS2 End, MACS2 Width, MACS2 Strand, MACS2 Name, MACS2 Score, MACS2 Fold Change, MACS2 Neg Log10 pvalue Summit, MACS2 Neg Log10 qvalue Summit, MACS2 Relative Summit Position",,FALSE,Sequencing,scATAC-seq Level 2,,
@@ -1030,3 +1032,8 @@ Barretts Esophagus Goblet Cells Present,Presence or absennce of Barretts esophag
 Pancreatitis Onset Year,Date of onset of pancreatitis.,,,,FALSE,Follow Up,,,num
 HTAN Parent Channel Metadata ID,HTAN ID for a level 3 channels table.,,,,TRUE, Imaging Level 4,,,
 Single Nucleus Capture,Nuclei isolation method,"Plates, 10x, droplet",,,FALSE,scmC-seq Level 1,,,
+Microarray Protocol Auxiliary File,"Auxiliary file describing the experimental protocols used, as described in the NCBI GEO microarray template, recorded as a Synapse ID (e.g. syn12345).",,"Component, Filename, File Format, HTAN Parent Data File ID, HTAN Data File ID",,FALSE,Assay,Microarray Level 1,,
+Microarray Platform ID,"NCBI GEO Microarray Platform ID, i.e. Agilent Design ID, linking to the table containing the array definition",,,,TRUE,Assay,Microarray Level 1,,
+Microarray Molecule,Microarray is measuring this kind of molecule,"DNA, RNA",,,TRUE,Assay,Microarray Level 1,,
+Microarray Label,Microarray used this kind of label,,,,TRUE,Assay,Microarray Level 1,,
+Microarray Value Definition,What the provided value signifies,,,,TRUE,Assay,Microarray Level 1,,