Wilms Tumor Dataset Annotation (SCPCP000006) #635 #671

Open
maud-p opened this issue Jul 29, 2024 · 2 comments

maud-p commented Jul 29, 2024

Please link to the GitHub Discussion for this proposed analysis.

#635 (reply in thread)

Describe the goals of this analysis module.

Here, we first aim to annotate the Wilms Tumor snRNA-seq samples in the SCPCP000006 (n=40) dataset. To do so we will:
• Provide annotations of normal cells composing the kidney, including normal kidney epithelium, endothelium, stroma and immune cells
• Provide annotations of tumor cell populations that may be present in the WT samples, including blastemal, epithelial, and stromal populations of cancer cells
Based on the provided annotation, we would like to additionally provide a reference of marker genes for the three cancer cell populations, which is so far lacking for the WT community.

The analysis will be divided as follows:

  1. Metadata file: compilation of a metadata file of marker genes for expected cell types that will be used for validation at a later step
  2. Script: clustering of cells across a set of parameters for a few samples
  3. Script: label transfer from the fetal kidney atlas reference using runAzimuth
  4. Script: run InferCNV
  5. Notebook: explore results from steps 2 to 4 for about 5 to 10 samples
  6. Script: compile scripts 2 to 4 into an R Markdown file with the required adjustments and render it across all samples
  7. Notebook: explore results from step 6, integrate all samples together and annotate the dataset using (i) the metadata file, (ii) CNV information, (iii) label transfer information
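As a rough illustration of step 2, the sketch below enumerates one clustering run per sample and parameter combination. The module itself is planned in R with Seurat; this Python snippet only shows the shape of such a sweep. The function name `build_clustering_grid`, the sample IDs, and the parameter names (`resolution`, `n_neighbors`) are hypothetical and not taken from the issue.

```python
from itertools import product


def build_clustering_grid(sample_ids, param_values):
    """Return one run description per (sample, parameter combination).

    `param_values` maps a parameter name to the list of values to try.
    All names used here are placeholders for illustration only.
    """
    names = sorted(param_values)  # fixed order so runs are reproducible
    runs = []
    for sample_id in sample_ids:
        for combo in product(*(param_values[name] for name in names)):
            run = {"sample_id": sample_id}
            run.update(zip(names, combo))
            runs.append(run)
    return runs


# Hypothetical sample IDs and parameter values.
grid = build_clustering_grid(
    sample_ids=["sample_A", "sample_B"],
    param_values={"resolution": [0.5, 1.0], "n_neighbors": [10, 20, 30]},
)
print(len(grid))  # 2 samples x 2 resolutions x 3 neighbor settings = 12
```

Keeping the grid as plain records makes it easy to restrict the sweep to a few samples first and to reuse the chosen settings when rendering across all samples in step 6.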

What software will you require?

I will use RStudio built from a Docker image based on rocker/tidyverse:4.3.0
BiocManager version = "3.17"

The main packages used are:

Seurat version 5
Azimuth version 5
inferCNV
SCpubr for visualization
DT for table visualization
DElegate for differential expression analysis

What will your first pull request contain?

The first pull request will be a metadata file containing a list of marker genes for the expected cell types.
The table will contain the following columns:

gene symbol
gene ENSEMBL id
cell type specificity
reference: DOI id of related publication
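A minimal sketch of how such a table could be laid out and sanity-checked is shown below. It is written in Python for illustration only (the module is planned in R). The column names, the tab-separated layout, and the helper names `read_marker_table` and `validate_marker_rows` are assumptions. The two example rows are illustrative, and the DOI is a placeholder, not a real reference.

```python
import csv
import io
import re

# Assumed column names mirroring the four columns listed above.
COLUMNS = ["gene_symbol", "ensembl_gene_id", "cell_type", "reference_doi"]

ENSEMBL_RE = re.compile(r"^ENSG\d{11}$")   # human Ensembl gene ID shape
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")  # loose DOI shape check

# Illustrative rows only; the DOI is a placeholder and IDs should be
# verified against Ensembl before use.
EXAMPLE_TSV = (
    "gene_symbol\tensembl_gene_id\tcell_type\treference_doi\n"
    "PTPRC\tENSG00000081237\timmune\t10.0000/placeholder-doi\n"
    "PECAM1\tENSG00000261371\tendothelium\t10.0000/placeholder-doi\n"
)


def read_marker_table(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def validate_marker_rows(rows):
    """Return a list of problems found; an empty list means no issues."""
    problems = []
    for i, row in enumerate(rows, start=1):
        missing = [c for c in COLUMNS if not (row.get(c) or "").strip()]
        if missing:
            problems.append(f"row {i}: missing {', '.join(missing)}")
            continue
        if not ENSEMBL_RE.match(row["ensembl_gene_id"]):
            problems.append(f"row {i}: malformed Ensembl ID")
        if not DOI_RE.match(row["reference_doi"]):
            problems.append(f"row {i}: malformed DOI")
    return problems


rows = read_marker_table(EXAMPLE_TSV)
print(validate_marker_rows(rows))
```

A check like this only confirms that the fields are present and well-formed; whether a gene is a suitable marker for a given cell type still has to come from the cited publication.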

What computational resources will you require?

I will use our own machine and computational resources.

If known, when do you expect to file the first pull request?

~01/08/2024 (1 August 2024)


jashapiro commented Jul 29, 2024

Thank you for filing this issue with your plans!

I will use RStudio build with a Docker image from the base image rocker/tidyverse:4.3.0
BiocManager version = "3.17"

For best compatibility with the other packages currently in use, you might consider using Bioconductor 3.19 and R 4.4. We use these in part because of a known security vulnerability in R <4.4.

For easiest implementation that saves on some installation time, you might consider using the bioconductor/tidyverse:3.19 image for your development.


maud-p commented Jul 29, 2024

Good to know, thank you very much! I'll build a Docker image based on bioconductor/tidyverse:3.19 before starting the module-2, then!

maud-p added a commit to maud-p/OpenScPCA-analysis that referenced this issue Jul 30, 2024
@maud-p maud-p mentioned this issue Aug 1, 2024