Commit

Merge pull request nf-core#644 from nf-core/samtools_view
Add option to restrict analysis to specific contigs
ramprasadn authored Nov 11, 2024
2 parents 9519733 + c98d063 commit afea4ff
Showing 9 changed files with 68 additions and 14 deletions.
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### `Added`

- A new analysis option `mito` to call and annotate only mitochondrial variants [#608](https://github.com/nf-core/raredisease/pull/608)
- An option to restrict analysis to specific contigs [#644](https://github.com/nf-core/raredisease/pull/644)

### `Changed`

@@ -28,6 +29,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Parameters

| Old parameter | New parameter |
| ------------- | ------------------- |
| | extract_alignments |
| | restrict_to_contigs |

### Tool updates

| Tool | Old version | New version |
5 changes: 5 additions & 0 deletions conf/modules/align_bwa_bwamem2_bwameme.config
@@ -50,6 +50,11 @@ process {
        ext.prefix = { "${meta.id}_sorted_merged" }
    }

    withName: '.*ALIGN:ALIGN_BWA_BWAMEM2_BWAMEME:EXTRACT_ALIGNMENTS' {
        ext.prefix = { "${meta.id}_sorted_merged_extracted" }
        ext.args2 = { params.restrict_to_contigs }
    }

    withName: '.*ALIGN:ALIGN_BWA_BWAMEM2_BWAMEME:MARKDUPLICATES' {
        ext.args = "--TMP_DIR ."
        ext.prefix = { "${meta.id}_sorted_md" }
5 changes: 5 additions & 0 deletions conf/modules/align_sentieon.config
@@ -30,6 +30,11 @@ process {
        ext.prefix = { "${meta.id}_merged.bam" }
    }

    withName: '.*ALIGN:ALIGN_SENTIEON:EXTRACT_ALIGNMENTS' {
        ext.prefix = { "${meta.id}_merged_extracted" }
        ext.args2 = { params.restrict_to_contigs }
    }

    withName: '.*ALIGN:ALIGN_SENTIEON:SENTIEON_DEDUP' {
        ext.args4 = { $params.rmdup ? "--rmdup" : '' }
        ext.prefix = { "${meta.id}_dedup.bam" }
21 changes: 12 additions & 9 deletions docs/usage.md
@@ -168,22 +168,25 @@ The mandatory and optional parameters for each category are tabulated below.

##### 1. Alignment

| Mandatory | Optional |
| ------------------------------ | ------------------------------ |
| aligner<sup>1</sup> | fasta_fai<sup>4</sup> |
| fasta<sup>2</sup> | bwamem2<sup>4</sup> |
| platform | bwa<sup>4</sup> |
| mito_name/mt_fasta<sup>3</sup> | bwameme<sup>4</sup> |
| | known_dbsnp<sup>5</sup> |
| | known_dbsnp_tbi<sup>5</sup> |
| | min_trimmed_length<sup>6</sup> |
| Mandatory | Optional |
| ------------------------------ | ------------------------------- |
| aligner<sup>1</sup> | fasta_fai<sup>4</sup> |
| fasta<sup>2</sup> | bwamem2<sup>4</sup> |
| platform | bwa<sup>4</sup> |
| mito_name/mt_fasta<sup>3</sup> | bwameme<sup>4</sup> |
| | known_dbsnp<sup>5</sup> |
| | known_dbsnp_tbi<sup>5</sup> |
| | min_trimmed_length<sup>6</sup> |
| | extract_alignments |
| | restrict_to_contigs<sup>7</sup> |

<sup>1</sup>Default value is bwamem2. Other alternatives are bwa, bwameme and sentieon (requires a valid Sentieon license).<br />
<sup>2</sup>Analysis set reference genome in fasta format, first 25 contigs need to be chromosome 1-22, X, Y and the mitochondria.<br />
<sup>3</sup>If mito_name is provided, mt_fasta can be generated by the pipeline.<br />
<sup>4</sup>fasta_fai, bwa, bwamem2 and bwameme, if not provided by the user, will be generated by the pipeline when necessary.<br />
<sup>5</sup>Used only by Sentieon.<br />
<sup>6</sup>Default value is 40. Used only by fastp.<br />
<sup>7</sup>Used to limit your analysis to specific contigs. Can be used to remove alignments to unplaced contigs to minimize potential errors. This parameter should be used in conjunction with the `extract_alignments` parameter.<br />

##### 2. QC stats from the alignment files

2 changes: 2 additions & 0 deletions nextflow.config
@@ -24,6 +24,8 @@ params {
    analysis_type = 'wgs'
    bwa_as_fallback = false
    bait_padding = 100
    extract_alignments = false
    restrict_to_contigs = null
    run_mt_for_wes = false
    run_rtgvcfeval = false
    save_mapped_as_cram = false
12 changes: 12 additions & 0 deletions nextflow_schema.json
@@ -503,6 +503,13 @@
    "help_text": "errorStrategy needs to be set to ignore for the bwamem2 process for the fallback to work. Turned off by default.",
    "fa_icon": "fas fa-toggle-on"
},
"extract_alignments": {
    "type": "boolean",
    "default": "false",
    "description": "After aligning the reads to a reference, extract alignments from specific regions/contigs and restrict the analysis to those regions/contigs.",
    "help_text": "Set this to true, and specify the contig(s) using the `restrict_to_contigs` parameter.",
    "fa_icon": "fas fa-toggle-on"
},
"platform": {
    "type": "string",
    "default": "illumina",
@@ -516,6 +523,11 @@
    "fa_icon": "fas fa-align-center",
    "enum": ["xy", "hetx", "sry"]
},
"restrict_to_contigs": {
    "type": "string",
    "description": "Can be specified as RNAME[:STARTPOS[-ENDPOS]]. Multiple regions should be separated by spaces.",
    "fa_icon": "fas fa-align-center"
},
"run_mt_for_wes": {
    "type": "boolean",
    "description": "Specifies whether to run mitochondrial analysis for wes samples",
2 changes: 1 addition & 1 deletion subworkflows/local/align.nf
@@ -70,7 +70,7 @@ workflow ALIGN {
ch_bwamem2_bai = ALIGN_BWA_BWAMEM2_BWAMEME.out.marked_bai
ch_versions = ch_versions.mix(ALIGN_BWA_BWAMEM2_BWAMEME.out.versions)
} else if (params.aligner.equals("sentieon")) {
ALIGN_SENTIEON ( // Triggered when params.aligner is set as sentieon
ALIGN_SENTIEON ( // Triggered when params.aligner is set as sentieon
ch_reads,
ch_genome_fasta,
ch_genome_fai,
10 changes: 10 additions & 0 deletions subworkflows/local/alignment/align_bwa_bwamem2_bwameme.nf
@@ -7,9 +7,11 @@ include { BWA_MEM as BWAMEM_FALLBACK } from '../../../modules/nf-c
include { BWAMEM2_MEM } from '../../../modules/nf-core/bwamem2/mem/main'
include { BWAMEME_MEM } from '../../../modules/nf-core/bwameme/mem/main'
include { SAMTOOLS_INDEX as SAMTOOLS_INDEX_ALIGN } from '../../../modules/nf-core/samtools/index/main'
include { SAMTOOLS_INDEX as SAMTOOLS_INDEX_EXTRACT } from '../../../modules/nf-core/samtools/index/main'
include { SAMTOOLS_INDEX as SAMTOOLS_INDEX_MARKDUP } from '../../../modules/nf-core/samtools/index/main'
include { SAMTOOLS_STATS } from '../../../modules/nf-core/samtools/stats/main'
include { SAMTOOLS_MERGE } from '../../../modules/nf-core/samtools/merge/main'
include { SAMTOOLS_VIEW as EXTRACT_ALIGNMENTS } from '../../../modules/nf-core/samtools/view/main'
include { PICARD_MARKDUPLICATES as MARKDUPLICATES } from '../../../modules/nf-core/picard/markduplicates/main'


@@ -82,6 +84,14 @@ workflow ALIGN_BWA_BWAMEM2_BWAMEME {
    SAMTOOLS_MERGE ( bams.multiple, ch_genome_fasta, ch_genome_fai )
    prepared_bam = bams.single.mix(SAMTOOLS_MERGE.out.bam)

    // GET ALIGNMENT FROM SELECTED CONTIGS
    if (params.extract_alignments) {
        SAMTOOLS_INDEX_EXTRACT ( prepared_bam )
        extract_bam_sorted_indexed = prepared_bam.join(SAMTOOLS_INDEX_EXTRACT.out.bai, failOnMismatch:true, failOnDuplicate:true)
        EXTRACT_ALIGNMENTS( extract_bam_sorted_indexed, ch_genome_fasta, [])
        prepared_bam = EXTRACT_ALIGNMENTS.out.bam
    }

    // Marking duplicates
    MARKDUPLICATES ( prepared_bam , ch_genome_fasta, ch_genome_fai )
    SAMTOOLS_INDEX_MARKDUP ( MARKDUPLICATES.out.bam )
19 changes: 15 additions & 4 deletions subworkflows/local/alignment/align_sentieon.nf
@@ -2,10 +2,13 @@
// A subworkflow to annotate structural variants.
//

include { SENTIEON_BWAMEM } from '../../../modules/nf-core/sentieon/bwamem/main'
include { SENTIEON_DATAMETRICS } from '../../../modules/nf-core/sentieon/datametrics/main'
include { SENTIEON_DEDUP } from '../../../modules/nf-core/sentieon/dedup/main'
include { SENTIEON_READWRITER } from '../../../modules/nf-core/sentieon/readwriter/main'
include { SENTIEON_BWAMEM } from '../../../modules/nf-core/sentieon/bwamem/main'
include { SENTIEON_DATAMETRICS } from '../../../modules/nf-core/sentieon/datametrics/main'
include { SENTIEON_DEDUP } from '../../../modules/nf-core/sentieon/dedup/main'
include { SENTIEON_READWRITER } from '../../../modules/nf-core/sentieon/readwriter/main'
include { SAMTOOLS_VIEW as EXTRACT_ALIGNMENTS } from '../../../modules/nf-core/samtools/view/main'
include { SAMTOOLS_INDEX as SAMTOOLS_INDEX_EXTRACT } from '../../../modules/nf-core/samtools/index/main'

workflow ALIGN_SENTIEON {
take:
ch_reads_input // channel: [mandatory] [ val(meta), path(reads_input) ]
@@ -36,6 +39,14 @@ workflow ALIGN_SENTIEON {
    SENTIEON_READWRITER ( merge_bams_in.multiple, ch_genome_fasta, ch_genome_fai )
    ch_bam_bai = merge_bams_in.single.mix(SENTIEON_READWRITER.out.output_index)

    // GET ALIGNMENT FROM SELECTED CONTIGS
    if (params.extract_alignments) {
        EXTRACT_ALIGNMENTS( ch_bam_bai, ch_genome_fasta, [])
        ch_bam_bai = EXTRACT_ALIGNMENTS.out.bam
        SAMTOOLS_INDEX_EXTRACT ( EXTRACT_ALIGNMENTS.out.bam )
        ch_bam_bai = EXTRACT_ALIGNMENTS.out.bam.join(SAMTOOLS_INDEX_EXTRACT.out.bai, failOnMismatch:true, failOnDuplicate:true)
    }

    SENTIEON_DATAMETRICS ( ch_bam_bai, ch_genome_fasta, ch_genome_fai, false )

    SENTIEON_DEDUP ( ch_bam_bai, ch_genome_fasta, ch_genome_fai )
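
Usage sketch (not part of this commit): the two new parameters are meant to be set together, and one way to do that might be a params file passed with Nextflow's `-params-file` option. The file name `params.json` and the contig names `chr20` and `chrM` are illustrative assumptions; contig names must match those in your reference FASTA, and each region follows the `RNAME[:STARTPOS[-ENDPOS]]` form given in the schema description.

```json
{
    "extract_alignments": true,
    "restrict_to_contigs": "chr20:1-5000000 chrM"
}
```

With these values, the `EXTRACT_ALIGNMENTS` step would receive `chr20:1-5000000 chrM` through `ext.args2`. Leaving `extract_alignments` at its default of `false` skips the step, whatever `restrict_to_contigs` is set to.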