CHORD has not been tested extensively on long-read datasets, so its predictions may not be accurate.
In particular, we observed that CHORD can produce incorrect predictions for samples with < 15X effective tumor coverage (effective tumor coverage = tumor coverage × tumor purity).
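As a quick illustration of the formula above, the following minimal sketch computes the effective tumor coverage and compares it with the 15X threshold. The function name and the example coverage and purity values are hypothetical and are not taken from this report.

```python
MIN_EFFECTIVE_COVERAGE = 15.0  # threshold quoted in the note above


def effective_tumor_coverage(tumor_coverage: float, tumor_purity: float) -> float:
    """Return tumor coverage scaled by tumor purity (a fraction between 0 and 1)."""
    if not 0.0 <= tumor_purity <= 1.0:
        raise ValueError("tumor_purity must be a fraction between 0 and 1")
    return tumor_coverage * tumor_purity


# Hypothetical example values, not taken from this report
coverage = effective_tumor_coverage(tumor_coverage=30.0, tumor_purity=0.4)
chord_may_be_unreliable = coverage < MIN_EFFECTIVE_COVERAGE
print(f"effective coverage: {coverage:.1f}X, below threshold: {chord_may_be_unreliable}")
```

With these made-up inputs the effective coverage falls below 15X, so the CHORD prediction for such a sample should be treated with caution.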
| Sample | Probability of BRCA1-type HRD | Probability of BRCA2-type HRD | Probability of HRD | HRD status | HRD type | Remarks on HRD status | Remarks on HRD type |
| --- | --- | --- | --- | --- | --- | --- | --- |
| COLO829_60-30 | 0 | 0 | 0 | HR_proficient | none | NA | NA |
Whole-genome copy number profile (Purple)
For visualization purposes, major copy numbers greater than 5 are capped at 5 in the plot.
Small variants (SNV/INDEL) coverage and variant allele frequency (VAF) distribution
Mutational signatures
Mutational signatures are estimated using the R package MutationalPatterns, based on SNVs only (INDELs are ignored).
Notes on small variants (SNV/INDEL) filtering
Variants are filtered with any of the following criteria:
IMPACT is HIGH
Existing_variation contains COS (COSMIC variants)
CLIN_SIG contains pathogenic
CANCER_TYPE is not NA (variants that are in IntOGen Cancer Gene Census)
MAX_AF (maximum population allele frequency) is less than 3%
CANCER_TYPE_ROLE and CANCER_TYPE_CGC_GENE are merged columns built from CANCER_TYPE, ROLE and CGC_CANCER_GENE. To keep the table readable, these columns are collapsed into single entries separated by semicolons. For example, CANCER_TYPE = “Breast;Prostate” and ROLE = “LoF;Act” means that the gene is a LoF in breast cancer and an Act in prostate cancer.
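The collapsed entries can be expanded back into per-cancer-type pairs. The sketch below assumes, as in the example above, that the semicolon-separated values in both columns are listed in the same order; the helper name `pair_collapsed` is hypothetical.

```python
def pair_collapsed(cancer_types: str, roles: str, sep: str = ";") -> list[tuple[str, str]]:
    """Pair each cancer type with the role at the same position."""
    type_list = cancer_types.split(sep)
    role_list = roles.split(sep)
    if len(type_list) != len(role_list):
        # Assumption: both columns always hold the same number of entries
        raise ValueError("CANCER_TYPE and ROLE have different numbers of entries")
    return list(zip(type_list, role_list))


# Values taken from the example in the note above
pairs = pair_collapsed("Breast;Prostate", "LoF;Act")
for cancer_type, role in pairs:
    print(f"{cancer_type}: {role}")
```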
Small variants (SNV/INDEL) table
Notes on structural variants (SVs)
Only SVs that affect genes in the IntOGen Cancer Gene Census (CGC) are shown.
Annotation is based on AnnotSV. However, to keep the output readable, some columns containing very long information (e.g. “_coord” and “_source”) are removed. Please refer to the original AnnotSV output for more information.
Columns with upper-case names are from IntOGen CGC. Please see the README from the IntOGen release for more information.
CANCER_TYPE_ROLE and CANCER_TYPE_CGC_GENE are merged columns built from CANCER_TYPE, ROLE and CGC_CANCER_GENE. To keep the table readable, these columns are collapsed into single entries separated by semicolons. For example, CANCER_TYPE = “Breast;Prostate” and ROLE = “LoF;Act” means that the gene is a LoF in breast cancer and an Act in prostate cancer.
Each SV can affect multiple genes. AnnotSV “splits” the different genes into different entries. This is why there are multiple rows with the same AnnotSV_ID.
The ALT allele for insertions is shown as “Too long” in the table. Please refer to the original AnnotSV output for more information.
Note that Severus can call a duplication as a BND event, and AnnotSV tends to annotate these as DEL events because it does not make use of the “STRAND” information. Therefore, the “SV_type” column is not very accurate for BND events (these can be recognized by SEVERUS_BND in the ID column).
The “SAMPLE” column represents the FORMAT column in the VCF. For Severus this is “GT:GQ:VAF:hVAF:DR:DV”
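To read individual fields out of the “SAMPLE” column, the colon-separated value can be matched against the Severus FORMAT keys listed above. This is a minimal sketch: the helper name `parse_sample` and the example value string are hypothetical, and the values are kept as strings because their types are not described here.

```python
SEVERUS_FORMAT = "GT:GQ:VAF:hVAF:DR:DV"  # FORMAT keys quoted in the note above


def parse_sample(sample_value: str, format_keys: str = SEVERUS_FORMAT) -> dict[str, str]:
    """Map each FORMAT key to the value at the same position in the SAMPLE string."""
    keys = format_keys.split(":")
    values = sample_value.split(":")
    if len(keys) != len(values):
        raise ValueError("SAMPLE value does not match the FORMAT keys")
    return dict(zip(keys, values))


# Hypothetical SAMPLE value, made up for illustration only
parsed = parse_sample("0/1:60:0.25:0.30:45:15")
print(parsed["GT"], parsed["VAF"])
```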
Structural variants (SVs) table
Notes on DMR filtering
The table shows DMRs, taken from the pipeline output generated using DSS, that overlap with promoters of genes in the IntOGen Cancer Gene Census (CGC).
Only DMRs that have nCG >= 50 and overlap with known promoter regions (annotated using annotatr) are shown. The pipeline output contains other annotated regions, such as exonic and intronic CpG islands, but these are not shown.
meanMethyl1 refers to the mean methylation level in tumor.
meanMethyl2 refers to the mean methylation level in normal.
length refers to the length of the DMR.
nCG refers to the number of CpG sites in the DMR. By default, the workflow requires at least 50 CpG sites in a DMR.
areaStat refers to the area statistic of the DMR. The larger the area statistic (in absolute value), the more significant the DMR is.
annot.X columns are produced by annotatr, and all upper-case columns are extracted from the IntOGen Compendium of Cancer Genes TSV file.
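The display filter described in these notes can be sketched as follows. This is only an illustration under stated assumptions: the records are made up, and the `annot.type` field name and its values (e.g. `hg38_genes_promoters`) are assumed to follow annotatr's usual naming rather than confirmed from the pipeline output. The sketch also leaves out the overlap with IntOGen CGC genes.

```python
MIN_CPG_SITES = 50  # default minimum number of CpG sites stated in the notes above


def is_reported(dmr: dict) -> bool:
    """Keep DMRs with enough CpG sites that overlap a promoter annotation."""
    has_enough_cpgs = dmr["nCG"] >= MIN_CPG_SITES
    # Assumption: promoter annotations contain the word "promoter" in annot.type
    overlaps_promoter = "promoter" in dmr["annot.type"]
    return has_enough_cpgs and overlaps_promoter


# Hypothetical DMR records, made up for illustration only
dmrs = [
    {"nCG": 72, "annot.type": "hg38_genes_promoters"},
    {"nCG": 35, "annot.type": "hg38_genes_promoters"},
    {"nCG": 90, "annot.type": "hg38_genes_introns"},
]
reported = [dmr for dmr in dmrs if is_reported(dmr)]
print(len(reported), "of", len(dmrs), "DMRs pass the filter")
```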
Table of DMRs overlapping with promoters of IntOGen CGC genes