adding more info on public resources
mistrm82 authored May 23, 2024
1 parent 4643e71 commit c755a34
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion lessons/01_data_organization.md
@@ -21,17 +21,20 @@ Cancer is the [second leading cause of death globally](https://www.who.int/healt
Image source: Johannessen CM and Boehm JS. Current Opinion in Systems Biology 2017
</p>

- There is a vast amount of genomic data deposited in public repositories which are available to researchers. These resources involve a range of large scale datasets and analysis tools, and require differing levels of computational expertise among users. Access to these resources allows us to:
+ There is a vast amount of **cancer genomic data deposited in public repositories** which are available to researchers. These resources involve a range of large scale datasets and analysis tools, and require differing levels of computational expertise among users. Access to these resources allows us to:

* Obtain data to re-analyze and explore questions different from those posed in the original studies
* Compare our results to large cancer databases for variant annotation
* Obtain reference datasets for benchmarking of variant calling algorithms

### [Genomic Data Commons](https://portal.gdc.cancer.gov/)
The GDC is a data repository funded by the National Cancer Institute (NCI) that provides researchers with access to genomic and clinical data from cancer patients. No original research is conducted as part of the GDC; its main purpose is to provide centralized access to data generated by other projects for broader research use. It contains data submitted by researchers and by large-scale cancer sequencing projects (such as the TCGA). The datasets go beyond whole genome sequencing, with data from RNA-seq, proteomics, imaging and other modalities. Researchers can query and explore the data through the portal using built-in analysis tools, or obtain raw data after being granted authorized access.
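
Besides the web portal, the GDC exposes a REST API at `https://api.gdc.cancer.gov`. The sketch below is not part of the original lesson. It assumes the public `/projects` endpoint and its `filters`, `fields`, `format` and `size` query parameters, and it requests open-access project metadata only. The function names `build_projects_query` and `fetch_projects` are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the public GDC REST API (assumed reachable without a token
# for open-access metadata such as project listings).
GDC_API = "https://api.gdc.cancer.gov"


def build_projects_query(program="TCGA", size=5):
    """Build the URL for a /projects query restricted to one program."""
    # GDC filters are JSON objects with an operator and a field/value pair.
    filters = {
        "op": "in",
        "content": {"field": "program.name", "value": [program]},
    }
    params = {
        "filters": json.dumps(filters),
        "fields": "project_id,name,primary_site",
        "format": "json",
        "size": str(size),
    }
    return f"{GDC_API}/projects?{urlencode(params)}"


def fetch_projects(program="TCGA", size=5):
    """Send the query (network access required) and return the matching records."""
    with urlopen(build_projects_query(program, size)) as response:
        payload = json.load(response)
    # Matching records are returned under data -> hits.
    return payload["data"]["hits"]
```

Calling `fetch_projects()` should return a short list of dictionaries, one per TCGA project, each holding the requested fields. Controlled-access raw data is not covered by this sketch; it still requires the authorized access described above.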

### [The Cancer Genome Atlas (TCGA)](https://www.cancer.gov/ccg/research/genome-sequencing/tcga)
The TCGA is a joint effort between the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). The TCGA has profiled and analyzed large numbers of human tumors (across 33 different cancer types) at the DNA, RNA, protein and epigenetic levels. Researchers at the TCGA conduct the original research, which includes collecting tumor samples, performing various omics analyses, and analyzing the resulting data. TCGA research has led to many discoveries about the molecular basis of cancer and identified potential biomarkers and therapeutic targets.

### [cBioPortal](https://www.cbioportal.org/)
cBioPortal is a free online resource for exploring, visualizing and analyzing cancer genomics data.

### The ICGC-TCGA DREAM Mutation Calling Challenge
