ccRCC snRNA-seq data used in this paper? #6

Open
YushaLiu opened this issue May 15, 2024 · 1 comment

Comments

@YushaLiu

I have a quick question about the ccRCC snRNA-seq data used in this study. Following the data availability section of the paper, I downloaded the data from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE240822. For each patient sample (e.g., C3L-00004-T1), the UMI count matrix (matrix.mtx.gz) often has about 1 million columns, one per barcode. Does each barcode in this matrix represent a single nucleus? If so, why is the number of barcodes so much larger than the number of nuclei in the annotation file GSE240822_GBM_ccRCC_RNA_metadata_CPTAC_samples.tsv?

@nvterekhanova
Collaborator

Hi @YushaLiu,

The .mtx.gz matrix files are raw feature-barcode matrices output by cellranger (https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/output/matrices); they contain all barcodes before filtering. The annotation files contain only the cell barcodes that remain after filtering, which is why the barcode counts differ so much between the two files.

Nadezhda
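
For anyone who wants to go from the raw matrix to the annotated nuclei, the idea above can be sketched as follows. This is a minimal, standard-library-only illustration, not the authors' pipeline: the function names `filtered_column_indices` and `subset_triplets` are hypothetical, the toy barcodes are made up, and it assumes the barcode strings in the annotation file match the raw barcode strings exactly (if they carry a sample prefix or suffix, they would need to be normalized first).

```python
def filtered_column_indices(raw_barcodes, annotated_barcodes):
    """Return 0-based column indices of raw barcodes present in the annotation."""
    keep = set(annotated_barcodes)
    return [i for i, bc in enumerate(raw_barcodes) if bc in keep]


def subset_triplets(triplets, kept_columns):
    """Keep (row, col, count) entries whose column survives, renumbering columns."""
    new_index = {old: new for new, old in enumerate(kept_columns)}
    return [(r, new_index[c], v) for r, c, v in triplets if c in new_index]


# Toy data, made up for illustration (not taken from GSE240822).
raw_barcodes = ["AAAC-1", "AAAG-1", "AACT-1", "AAGT-1"]  # all barcodes, unfiltered
annotated = ["AACT-1", "AAAC-1"]                         # barcodes kept after filtering
# Sparse counts as (gene_row, barcode_column, umi_count), 0-based.
triplets = [(0, 0, 3), (1, 1, 1), (0, 2, 5), (2, 3, 1)]

cols = filtered_column_indices(raw_barcodes, annotated)
filtered = subset_triplets(triplets, cols)
```

With real data, the same selection would typically be applied to the columns of the sparse matrix loaded from matrix.mtx.gz, using the accompanying barcode list for `raw_barcodes` and the barcode column of the annotation file for `annotated`.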
