Add mapping tools and viral pathogen List #21

Merged
merged 25 commits into from
Jul 22, 2024
25 commits
3b0328e
Add bowtie2 and minimap2 modules
LilyAnderssonLee Jun 18, 2024
182fa6a
ignore prettier check for viral_pathogen_List
LilyAnderssonLee Jun 18, 2024
70fa463
update the pathogen list
LilyAnderssonLee Jun 24, 2024
25241c0
Update .nf-core.yml
LilyAnderssonLee Jun 24, 2024
87791e4
Update nextflow_schema.json
LilyAnderssonLee Jun 24, 2024
2788b9e
add test_taxid.config and rename the pathogen list name
LilyAnderssonLee Jun 24, 2024
aab25e2
update the test.config
LilyAnderssonLee Jun 24, 2024
c065fe8
add a new line to viralPahogenList.csv
LilyAnderssonLee Jun 24, 2024
0ff8eed
screen pathogens for short reads and update moduels of extractcentrif…
LilyAnderssonLee Jul 10, 2024
59185ca
remove extra left-padding spaces in config/modules.config
LilyAnderssonLee Jul 10, 2024
003feba
add pathogen screening for long reads
LilyAnderssonLee Jul 10, 2024
2a7a354
update the CHANGELOG
LilyAnderssonLee Jul 11, 2024
17e72b6
Update conf/modules.config
LilyAnderssonLee Jul 18, 2024
484429b
Update conf/modules.config
LilyAnderssonLee Jul 18, 2024
22aa86a
Update conf/modules.config
LilyAnderssonLee Jul 18, 2024
6a5e515
Update conf/modules.config
LilyAnderssonLee Jul 18, 2024
3b912c7
Update nextflow_schema.json
LilyAnderssonLee Jul 18, 2024
b19db4f
Update nextflow_schema.json
LilyAnderssonLee Jul 18, 2024
7b1b179
Update nextflow_schema.json
LilyAnderssonLee Jul 18, 2024
c2cc381
Update nextflow_schema.json
LilyAnderssonLee Jul 18, 2024
eac76ca
Trail whitespace
LilyAnderssonLee Jul 18, 2024
bed5faa
move the multiqc to the end of modules.config
LilyAnderssonLee Jul 18, 2024
d16c390
Update nextflow_schema.json
LilyAnderssonLee Jul 18, 2024
36cc422
Update nextflow.config
LilyAnderssonLee Jul 18, 2024
9346870
Update nextflow_schema.json
LilyAnderssonLee Jul 18, 2024
1 change: 1 addition & 0 deletions .nf-core.yml
@@ -15,6 +15,7 @@ lint:
- assets/email_template.html
- assets/email_template.txt
- docs/README.md
- viralPathogenList.csv

multiqc_config:
- report_comment
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Extract Kraken2 reads with KrakenTools
- Extract Centrifuge reads
- Extract DIAMOND reads
- Screen pathogens via mapping against genomes from a list of pathogens: Bowtie2 for short reads and Minimap2 for long reads

### `Fixed`

102 changes: 102 additions & 0 deletions assets/test_data/reference/accession2taxid.map
@@ -0,0 +1,102 @@
NC_002205.1 518987
NC_002206.1 518987
NC_002207.1 518987
NC_002208.1 518987
NC_002209.1 518987
NC_002210.1 518987
NC_002211.1 518987
NC_002204.1 518987
NC_004910.1 130760
NC_004911.1 130760
NC_004912.1 130760
NC_004908.1 130760
NC_004905.2 130760
NC_004909.1 130760
NC_004907.1 130760
NC_004906.1 130760
NC_006307.2 11553
NC_006308.2 11553
NC_006309.2 11553
NC_006310.2 11553
NC_006311.1 11553
NC_006312.2 11553
NC_006306.2 11553
NC_007357.1 93838
NC_007358.1 93838
NC_007359.1 93838
NC_007362.1 93838
NC_007360.1 93838
NC_007361.1 93838
NC_007363.1 93838
NC_007364.1 93838
NC_007373.1 335341
NC_007372.1 335341
NC_007371.1 335341
NC_007366.1 335341
NC_007369.1 335341
NC_007368.1 335341
NC_007367.1 335341
NC_007370.1 335341
NC_002023.1 211044
NC_002021.1 211044
NC_002022.1 211044
NC_002017.1 211044
NC_002019.1 211044
NC_002018.1 211044
NC_002016.1 211044
NC_002020.1 211044
NC_007378.1 488241
NC_007375.1 488241
NC_007376.1 488241
NC_007374.1 488241
NC_007381.1 488241
NC_007382.1 488241
NC_007377.1 488241
NC_007380.1 488241
NC_026422.1 1332244
NC_026423.1 1332244
NC_026424.1 1332244
NC_026425.1 1332244
NC_026426.1 1332244
NC_026429.1 1332244
NC_026427.1 1332244
NC_026428.1 1332244
NC_026438.1 641809
NC_026435.1 641809
NC_026437.1 641809
NC_026433.1 641809
NC_026436.1 641809
NC_026434.1 641809
NC_026431.1 641809
NC_026432.1 641809
NC_036616.1 1173138
NC_036615.1 1173138
NC_036619.1 1173138
NC_036618.1 1173138
NC_036617.1 1173138
NC_036620.1 1173138
NC_036621.1 1173138
NC_060925.1 9606
NC_060926.1 9606
NC_060927.1 9606
NC_060928.1 9606
NC_060929.1 9606
NC_060930.1 9606
NC_060931.1 9606
NC_060932.1 9606
NC_060933.1 9606
NC_060934.1 9606
NC_060935.1 9606
NC_060936.1 9606
NC_060937.1 9606
NC_060938.1 9606
NC_060939.1 9606
NC_060940.1 9606
NC_060941.1 9606
NC_060942.1 9606
NC_060943.1 9606
NC_060944.1 9606
NC_060945.1 9606
NC_060946.1 9606
NC_060947.1 9606
NC_060948.1 9606
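The map above pairs each reference accession with an NCBI taxid, one pair per line. As a minimal sketch of how such a file could be read and summarised (the function name `parse_accession2taxid` and the inline sample are illustrative assumptions, not part of this PR), the following Python groups accessions by taxid. It splits on any whitespace, so it works whether the real file uses tabs or spaces.

```python
from collections import defaultdict


def parse_accession2taxid(text):
    """Group accessions by taxid from whitespace-separated 'accession taxid' lines."""
    by_taxid = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        accession, taxid = fields
        by_taxid[int(taxid)].append(accession)
    return dict(by_taxid)


# A few lines copied from the map above; the real file has 102 entries.
sample = """\
NC_002205.1 518987
NC_002206.1 518987
NC_004910.1 130760
NC_060925.1 9606
"""

groups = parse_accession2taxid(sample)
for taxid, accessions in groups.items():
    print(taxid, len(accessions), accessions)
```

Grouping this way makes it easy to check that every segment of a multi-segment genome carries the same taxid as its siblings.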
5,826 changes: 5,826 additions & 0 deletions assets/test_data/reference/reference.fna

Large diffs are not rendered by default.

69 changes: 64 additions & 5 deletions conf/modules.config
@@ -24,7 +24,7 @@ process {
publishDir = [
path: { "${params.outdir}/fastqc/input" },
mode: params.publish_dir_mode,
pattern: '*.{html,zip}'
pattern: '*.{html}'
]
}

@@ -36,11 +36,12 @@ process {
pattern: '*viral_taxids.tsv'
]
}

withName: KRAKENTOOLS_EXTRACTKRAKENREADS {
ext.args = { params.fastq_output ? "--fastq-output" : ""}
ext.prefix = { "${meta.id}_${taxid.toString().replaceAll(' ', '-')}" }
publishDir = [
path: { "${params.outdir}/reads/kraken2" },
path: { "${params.outdir}/extracted_reads/kraken2" },
mode: params.publish_dir_mode,
pattern: '*.{fastq,fasta}'
]
@@ -49,26 +50,84 @@ process {
withName: EXTRACTCENTRIFUGEREADS {
ext.prefix = { "${meta.id}" }
publishDir = [
path: { "${params.outdir}/reads/centrifuge" },
path: { "${params.outdir}/extracted_reads/centrifuge" },
mode: params.publish_dir_mode,
pattern: '*.fastq'
]
}

withName: EXTRACTCDIAMONDREADS {
ext.prefix = { "${meta.id}" }
publishDir = [
path: { "${params.outdir}/reads/diamond" },
path: { "${params.outdir}/extracted_reads/diamond" },
mode: params.publish_dir_mode,
pattern: '*.fastq'
]
}

withName: '.*:FASTQ_ALIGN_BOWTIE2:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_(SORT|INDEX)' {
publishDir = [
path: { "$params.outdir/pathogens/bowtie2/align" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: 'SAMTOOLS_SORT' {
ext.prefix = { "${meta.id}_aligned_pathogens_genome_sorted"}
}

withName: '.*:FASTQ_ALIGN_BOWTIE2:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:.*' {
publishDir = [ enabled: false ]
}

withName: 'BOWTIE2_BUILD_PATHOGEN' {
ext.args = '--large-index'
publishDir = [
path: { "$params.outdir/pathogens/bowtie2/build" },
mode: params.publish_dir_mode,
pattern: 'bowtie2'
]
}

withName: '.*:FASTQ_ALIGN_BOWTIE2:BOWTIE2_ALIGN' {
ext.args = '--no-unal'
publishDir = [ enabled: false]
}

withName: '.*:LONGREAD_SCREENPATHOGEN:MINIMAP2_INDEX' {
ext.args = '-x map-ont'
publishDir = [
path: { "$params.outdir/pathogens/minimap2/index" },
mode: params.publish_dir_mode,
pattern: '*.mmi'
]
}

withName: '.*:LONGREAD_SCREENPATHOGEN:MINIMAP2_ALIGN' {
ext.args = '--sam-hit-only'
publishDir = [ enabled: false]
}

withName: '.*:LONGREAD_SCREENPATHOGEN:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_(SORT|INDEX)' {
publishDir = [
path: { "$params.outdir/pathogens/minimap2/align" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: '.*:LONGREAD_SCREENPATHOGEN:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:.*' {
publishDir = [ enabled: false ]
}

withName: MULTIQC {
ext.args = { { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' } }
publishDir = [
path: { "${params.outdir}/multiqc"},
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]

}

}
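Several of the `withName` selectors above are regular expressions over the fully qualified process name, while `'SAMTOOLS_SORT'` is a bare name. The Python sketch below illustrates which selectors would apply to which process, assuming a selector applies when it fully matches either the simple name or the fully qualified name. That matching rule is an assumption made for illustration, and the workflow prefixes `METAVAL:SCREEN` and `METAVAL` in the example names are hypothetical; only the selector strings are copied from the config.

```python
import re

# Selector strings copied from conf/modules.config in this diff.
SELECTORS = [
    r".*:FASTQ_ALIGN_BOWTIE2:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_(SORT|INDEX)",
    r"SAMTOOLS_SORT",
    r".*:FASTQ_ALIGN_BOWTIE2:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:.*",
    r".*:LONGREAD_SCREENPATHOGEN:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_(SORT|INDEX)",
]


def matching_selectors(full_name):
    """Return the selectors that fully match the simple or the qualified name."""
    simple_name = full_name.rsplit(":", 1)[-1]
    return [
        s for s in SELECTORS
        if re.fullmatch(s, simple_name) or re.fullmatch(s, full_name)
    ]


# Hypothetical fully qualified names; the leading workflow scopes are assumed.
short_sort = "METAVAL:SCREEN:FASTQ_ALIGN_BOWTIE2:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_SORT"
long_sort = "METAVAL:LONGREAD_SCREENPATHOGEN:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_SORT"
short_stats = (
    "METAVAL:SCREEN:FASTQ_ALIGN_BOWTIE2:BAM_SORT_STATS_SAMTOOLS:"
    "BAM_STATS_SAMTOOLS:SAMTOOLS_STATS"
)

for name in (short_sort, long_sort, short_stats):
    print(name.rsplit(":", 1)[-1], "->", matching_selectors(name))
```

Under these assumptions the bare `SAMTOOLS_SORT` selector applies to both the Bowtie2 and the Minimap2 branch, so both would share the `_aligned_pathogens_genome_sorted` prefix, while the publish directory is set separately per branch.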
9 changes: 5 additions & 4 deletions conf/test.config
@@ -20,18 +20,19 @@ params {
max_time = '6.h'

// Input data
//input = '../final_test_data/samplesheet_v3.csv'
input = 'assets/samplesheet_v1.csv'

taxid = '211044 11676' //separated by space
pathogens_genome = 'assets/test_data/reference/reference.fna'

// Extract reads
perform_extract_reads = true
extract_kraken2_reads = true
fastq_output = true

extract_centrifuge_reads = true
extract_diamond_reads = true

// Screen pathogens
perform_screen_pathogens = true

// Genome references
genome = 'R64-1-1'
}
15 changes: 12 additions & 3 deletions conf/test_full.config
@@ -15,9 +15,18 @@ params {
config_profile_description = 'Full test dataset to check pipeline function'

// Input data for full size test
// TODO nf-core: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
// TODO nf-core: Give any required params for the test so that command line flags are not needed
input = params.pipelines_testdata_base_path + 'viralrecon/samplesheet/samplesheet_full_illumina_amplicon.csv'
input = '../final_test_data/samplesheet_v3.csv'
pathogens_genome = 'assets/test_data/reference/reference.fna'

// Extract reads
perform_extract_reads = true
extract_kraken2_reads = true
fastq_output = true
extract_centrifuge_reads = true
extract_diamond_reads = true

// Screen pathogens
perform_screen_pathogens = true

// Genome references
genome = 'R64-1-1'
39 changes: 39 additions & 0 deletions conf/test_taxid.config
@@ -0,0 +1,39 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines an input taxid list to run a fast and simple pipeline test.

Use as follows:
nextflow run genomic-medicine-sweden/meta-val -profile test_taxid,<docker/singularity> --outdir <OUTDIR>

----------------------------------------------------------------------------------------
*/

params {
config_profile_name = 'Test user defined taxid profile'
config_profile_description = 'Minimal test dataset to check pipeline function'

// Limit resources so that this can run on GitHub Actions
max_cpus = 2
max_memory = '6.GB'
max_time = '6.h'

// Input data
input = 'assets/samplesheet_v1.csv'
taxid = '211044 11676' //separated by space
pathogens_genome = 'assets/test_data/reference/reference.fna'

// Extract reads
perform_extract_reads = true
extract_kraken2_reads = true
fastq_output = true
extract_centrifuge_reads = true
extract_diamond_reads = true

// Screen pathogens
perform_screen_pathogens = false

// Genome references
genome = 'R64-1-1'
}
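The `taxid` parameter in this profile is a single space-separated string, and the `KRAKENTOOLS_EXTRACTKRAKENREADS` entry in `conf/modules.config` builds its output prefix as `"${meta.id}_${taxid.toString().replaceAll(' ', '-')}"`. A minimal Python sketch of that string handling follows. The sample ID `sample1` is hypothetical, and whether the pipeline passes the whole string or each ID separately to the process is not shown in this diff.

```python
def extraction_prefix(sample_id, taxid):
    """Mirror "${meta.id}_${taxid.toString().replaceAll(' ', '-')}" from modules.config."""
    return f"{sample_id}_{str(taxid).replace(' ', '-')}"


taxid_param = "211044 11676"  # value from conf/test_taxid.config

# Individual taxids, if they need to be handled one at a time.
taxids = taxid_param.split()
print(taxids)

# Prefix when the whole space-separated string is passed as one value.
print(extraction_prefix("sample1", taxid_param))

# Prefix when a single numeric taxid is passed.
print(extraction_prefix("sample1", 211044))
```

Replacing spaces with hyphens keeps the resulting file names free of whitespace even when several taxids are extracted together.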
60 changes: 60 additions & 0 deletions modules.json
@@ -5,6 +5,16 @@
"https://github.com/nf-core/modules.git": {
"modules": {
"nf-core": {
"bowtie2/align": {
"branch": "master",
"git_sha": "e4bad511789f16d0df39ee306b2cd50418365048",
"installed_by": ["fastq_align_bowtie2", "modules"]
},
"bowtie2/build": {
"branch": "master",
"git_sha": "1fea64f5132a813ec97c1c6d3a74e0aee7142b6d",
"installed_by": ["modules"]
},
"fastqc": {
"branch": "master",
"git_sha": "285a50500f9e02578d90b3ce6382ea3c30216acd",
@@ -15,15 +25,65 @@
"git_sha": "77decb880af3f0ea6d22b2383c3f1ed86aac4aa2",
"installed_by": ["modules"]
},
"minimap2/align": {
"branch": "master",
"git_sha": "e83b347b3e674de6bb1bb7bdf9b2674768c8e0fe",
"installed_by": ["modules"]
},
"minimap2/index": {
"branch": "master",
"git_sha": "72e277acfd9e61a9f1368eafb4a9e83f5bcaa9f5",
"installed_by": ["modules"]
},
"multiqc": {
"branch": "master",
"git_sha": "b7ebe95761cd389603f9cc0e0dc384c0f663815a",
"installed_by": ["modules"]
},
"samtools/flagstat": {
"branch": "master",
"git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519",
"installed_by": ["bam_stats_samtools"]
},
"samtools/idxstats": {
"branch": "master",
"git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519",
"installed_by": ["bam_stats_samtools"]
},
"samtools/index": {
"branch": "master",
"git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519",
"installed_by": ["bam_sort_stats_samtools"]
},
"samtools/sort": {
"branch": "master",
"git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519",
"installed_by": ["bam_sort_stats_samtools"]
},
"samtools/stats": {
"branch": "master",
"git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519",
"installed_by": ["bam_stats_samtools"]
}
}
},
"subworkflows": {
"nf-core": {
"bam_sort_stats_samtools": {
"branch": "master",
"git_sha": "0eacd714effe5aac1c1de26593873960b3346cab",
"installed_by": ["fastq_align_bowtie2"]
},
"bam_stats_samtools": {
"branch": "master",
"git_sha": "0eacd714effe5aac1c1de26593873960b3346cab",
"installed_by": ["bam_sort_stats_samtools"]
},
"fastq_align_bowtie2": {
"branch": "master",
"git_sha": "0eacd714effe5aac1c1de26593873960b3346cab",
"installed_by": ["subworkflows"]
},
"utils_nextflow_pipeline": {
"branch": "master",
"git_sha": "5caf7640a9ef1d18d765d55339be751bb0969dfa",