Add sf to file format #402

Merged

adamjtaylor
Contributor

Closes #399

@adamjtaylor adamjtaylor linked an issue May 9, 2024 that may be closed by this pull request
2 tasks
@adamjtaylor adamjtaylor requested a review from aditigopalan May 9, 2024 08:34
@adamjtaylor
Contributor Author

Model integrity / Check attributes are unique fails:

"Panel Name" appears 2 times in the data model
"Total Number of Cells" appears 2 times in the data model
"Total Number of Targets" appears 2 times in the data model
"Experiment IF Channels" appears 2 times in the data model
"Transcripts per Cell" appears 2 times in the data model
"Percent of Transcripts within Cells" appears 2 times in the data model

I am guessing we didn't properly resolve conflicts when merging CosMx and Xenium.
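For reference, a check like this can be reproduced locally. The following is only a minimal sketch, not the repository's actual CI check: it assumes the data model CSV has a column named `Attribute` (the header row is not shown in this PR), and the function name `find_duplicate_attributes` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def find_duplicate_attributes(csv_text, column="Attribute"):
    """Return {attribute name: count} for names that occur more than once.

    `column` is an assumed header name; adjust it to the real data model.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column].strip() for row in reader)
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical sample: one attribute defined twice, as after a bad merge.
sample = """Attribute,Description
Panel Name,Name of the panel
Total Number of Cells,Cell count
Panel Name,Name of the panel (second copy)
"""

for name, n in find_duplicate_attributes(sample).items():
    print(f'"{name}" appears {n} times in the data model')
```

Running it against the real CSV (read from disk instead of the inline sample) should list the same six attributes reported above, if the assumption about the column name holds.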

@adamjtaylor
Contributor Author

We will address the model integrity failure in #402. OK to merge if the JSON-LD shows only one change: the added sf valid value.

@@ -6,7 +6,7 @@ Component,"Category of metadata (e.g. Diagnosis, Biospecimen, scRNA-seq Level 1,
Patient,HTAN patient,,"Component, HTAN Participant ID",,FALSE,Individual Organism,"Demographics, Family History, Exposure, Follow Up, Diagnosis, Therapy, Molecular Test",,
File,A type of Information Content Entity specific to OS,,,,FALSE,Information Content Entity,,https://w3id.org/biolink/vocab/DataFile,
Filename,Name of a file,,,,TRUE,,,,regex search ^.+\/\S*$
-File Format,"Format of a file (e.g. txt, csv, fastq, bam, etc.)","hdf5, bedgraph, idx, idat, bam, bai, excel, powerpoint, tif, tiff, OME-TIFF, png, doc, pdf, fasta, fastq, sam, vcf, bcf, maf, bed, chp, cel, sif, tsv, csv, txt, plink, bigwig, wiggle, gct, bgzip, zip, seg, html, mov, hyperlink, svs, md, flagstat, gtf, raw, msf, rmd, bed narrowPeak, bed broadPeak, bed gappedPeak, avi, pzfx, fig, xml, tar, R script, abf, bpm, dat, jpg, locs, Sentrix descriptor file, Python script, sav, gzip, sdf, RData, hic, ab1, 7z, gff3, json, sqlite, svg, sra, recal, tranches, mtx, tagAlign, dup, DICOM, czi, mex, cloupe, am, cell am, mpg, m, mzML,scn, dcc, rcc, pkc",,,TRUE,,,,
+File Format,"Format of a file (e.g. txt, csv, fastq, bam, etc.)","hdf5, bedgraph, idx, idat, bam, bai, excel, powerpoint, tif, tiff, OME-TIFF, png, doc, pdf, fasta, fastq, sam, vcf, bcf, maf, bed, chp, cel, sif, tsv, csv, txt, plink, bigwig, wiggle, gct, bgzip, zip, seg, html, mov, hyperlink, svs, md, flagstat, gtf, raw, msf, rmd, bed narrowPeak, bed broadPeak, bed gappedPeak, avi, pzfx, fig, xml, tar, R script, abf, bpm, dat, jpg, locs, Sentrix descriptor file, Python script, sav, gzip, sdf, RData, hic, ab1, 7z, gff3, json, sqlite, svg, sra, recal, tranches, mtx, tagAlign, dup, DICOM, czi, mex, cloupe, am, cell am, mpg, m, mzML,scn, dcc, rcc, pkc, sf",,,TRUE,,,,
Contributor Author

This is the addition
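To confirm that a change like this touches only one valid value, the two cells can be compared as sets. This is a rough sketch, assuming the valid values cell is a plain comma-separated list; the helper names are hypothetical and the strings below are shortened excerpts of the real cell.

```python
def parse_valid_values(cell):
    """Split a comma-separated valid values cell into trimmed entries."""
    return [v.strip() for v in cell.split(",") if v.strip()]


def valid_value_diff(old_cell, new_cell):
    """Return (added, removed) valid values between two versions of a cell."""
    old = set(parse_valid_values(old_cell))
    new = set(parse_valid_values(new_cell))
    return sorted(new - old), sorted(old - new)


# Shortened excerpts of the File Format valid values before and after.
old_cell = "mpg, m, mzML,scn, dcc, rcc, pkc"
new_cell = "mpg, m, mzML,scn, dcc, rcc, pkc, sf"

added, removed = valid_value_diff(old_cell, new_cell)
print("added:", added)      # added: ['sf']
print("removed:", removed)  # removed: []
```

Trimming whitespace matters here because the existing list mixes `mzML,scn` with `, dcc`; without it the comparison would report spurious differences.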

@@ -122,7 +122,7 @@ GeoMx DSP Workflow Parameter Description,Parameters used to run the GeoMx DSP wo
GeoMx DSP Workflow Link,Link to workflow or command. DockStore.org recommended. URL,,,,FALSE,Spatial Transcriptomics,,,
NanoString GeoMx DSP ROI RCC Segment Annotation Metadata,GeoMx ROI and Segment Metadata Attributes. The assayed biospecimen should be reported one per row with the associated ROI coordinates. ,,"HTAN Parent Biospecimen ID, Scan name, ROI name, Segment name, ROI X Coordinate,ROI Y Coordinate, Tags, QC status, Scan Height, Scan Width, Scan Offset X, Scan Offset Y, Binding Density, Positive norm factor, Surface area, Nuclei count, Tissue Stain",,FALSE,Assay,,,
Scan name,GeoMx Scan name (as appears in Segment Summary),,,,TRUE,"NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata",,,
-ROI name,"ROI name (application generated). For Xenium this is referred to as the “region name”",,,,TRUE,"NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata",,,
+ROI name,ROI name (application generated). For Xenium this is referred to as the “region name”,,,,TRUE,"NanoString GeoMx DSP ROI RCC Segment Annotation Metadata, NanoString GeoMx DSP ROI DCC Segment Annotation Metadata",,,
Contributor Author

Here and below, these changes are all OK: they just drop superfluous quotes, per the CSV linter.
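A quick way to see why dropping these quotes is safe: a CSV parser returns the same fields whether or not a comma-free field is quoted. The sketch below uses Python's standard `csv` module with simplified, made-up rows rather than the real data model lines; `parse_row` is a hypothetical helper.

```python
import csv
import io


def parse_row(line):
    """Parse a single CSV line into its list of fields."""
    return next(csv.reader(io.StringIO(line)))


# Quotes around a field with no comma are superfluous: same parse result.
quoted = 'ROI name,"ROI name (application generated)",,,,TRUE'
unquoted = 'ROI name,ROI name (application generated),,,,TRUE'
print(parse_row(quoted) == parse_row(unquoted))  # True

# Quotes around a field that contains a comma are required.
needed = 'File Format,"txt, csv",TRUE'
broken = 'File Format,txt, csv,TRUE'
print(len(parse_row(needed)), len(parse_row(broken)))  # 3 4
```

So the linter can only drop quotes from fields without commas, which is the case for the description shown in this hunk.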

Contributor

@aditigopalan aditigopalan left a comment

lgtm!

@aditigopalan aditigopalan merged commit eaf24c5 into main May 9, 2024
@aditigopalan aditigopalan deleted the 399-bulkrnaseq-level-2-updates-to-accommodate-salmon-files branch May 9, 2024 12:18
Successfully merging this pull request may close these issues.

[BulkRNAseq Level 2] Updates to accommodate Salmon files
2 participants