v1.1.0
Public Health Bioinformatics v1.1.0 Release Notes
This minor release introduces two new workflows, changes the outputs for the ONT workflows, and resolves various bugs.
New workflows:
- Terra_2_GISAID
  This workflow will submit concatenated metadata and assembly files to GISAID directly from Terra. The user must obtain a GISAID client-id before they can use this workflow.
- Usher_PHB
  This workflow will place your samples onto the most up-to-date versions of UCSC's UShER phylogenetic trees and return subtree(s) that the user can visualize.
Major output changes in TheiaCoV_ONT and TheiaProk_ONT workflows
We identified an issue when using cg_pipeline in our ONT workflows that led to inaccurate QC metrics. We have corrected this issue by deprecating the use of cg_pipeline in all ONT workflows. QC metrics are now calculated using nanoplot, which is a tool geared specifically for ONT data. In addition, since fastq-scan is now redundant in these workflows, it has been removed.
Also, the maximum read length in TheiaProk_ONT was previously set to 10,000 base pairs. We have increased this to 100,000 base pairs by default.
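To illustrate what the higher default means in practice, here is a minimal sketch. It assumes the limit acts as a simple inclusive upper bound on read length; the function name `filter_by_max_length` and the toy reads are hypothetical and are not taken from the workflow itself.

```python
def filter_by_max_length(reads, max_length=100_000):
    """Keep (name, sequence) pairs whose sequence length is at most max_length.

    Assumption: the cap is inclusive and over-length reads are simply discarded.
    """
    return [(name, seq) for name, seq in reads if len(seq) <= max_length]


# Toy reads of 5,000, 50,000 and 150,000 bases (hypothetical data).
reads = [
    ("read_short", "A" * 5_000),
    ("read_long", "A" * 50_000),
    ("read_ultralong", "A" * 150_000),
]

kept_old_default = filter_by_max_length(reads, max_length=10_000)
kept_new_default = filter_by_max_length(reads)  # uses the new 100,000 bp default
```

Under these assumptions, the 50,000-base read is dropped with the previous 10,000 bp cap but retained with the new default, while the 150,000-base read is dropped in both cases.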
- TheiaProk_ONT New Outputs
  The following columns are new:
  - nanoplot_num_reads_clean1
  - nanoplot_num_reads_raw1
  - nanoplot_r1_mean_q_clean
  - nanoplot_r1_mean_q_raw
  - nanoplot_r1_mean_readlength_clean
  - nanoplot_r1_mean_readlength_raw
  - nanoplot_tsv_clean
  - nanoplot_tsv_raw
  - nanoplot_version
  - nanoplot_docker
  - nanoplot_html_clean
  - nanoplot_html_raw
  The following variables are now generated using nanoplot:
  - est_coverage_raw
  - est_coverage_clean
  The following variables have been removed:
  - num_reads_clean1
  - num_reads_raw1
  - r1_mean_q_raw
  - r1_mean_readlength_raw
  - fastq_scan_version
- TheiaCoV_ONT New Outputs
  The following columns are new:
  - nanoplot_tsv_clean
  - nanoplot_tsv_raw
  - nanoplot_version
  - nanoplot_docker
  - nanoplot_html_clean
  - nanoplot_html_raw
  - est_coverage_raw
  - est_coverage_clean
  - r1_mean_readlength_clean
  - r1_mean_readlength_raw
  - r1_mean_q_clean
  - r1_mean_q_raw
  The following variables are now generated using nanoplot:
  - num_reads_clean1
  - num_reads_raw1
  The following variables have been removed:
  - fastq_scan_version
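Downstream scripts that select TheiaProk_ONT columns by name may need updating after this release. The sketch below shows one way to translate a list of pre-v1.1.0 column names. The mapping is an assumption inferred from the matching names in the lists above (these notes do not state that each nanoplot-prefixed column is a direct replacement), and `update_column_names` is a hypothetical helper.

```python
# Assumed mapping: removed TheiaProk_ONT columns -> like-named nanoplot columns.
# Inferred from the column names in these release notes, not stated explicitly.
RENAMED_COLUMNS = {
    "num_reads_clean1": "nanoplot_num_reads_clean1",
    "num_reads_raw1": "nanoplot_num_reads_raw1",
    "r1_mean_q_raw": "nanoplot_r1_mean_q_raw",
    "r1_mean_readlength_raw": "nanoplot_r1_mean_readlength_raw",
}

# Removed columns with no like-named counterpart in the new outputs.
DROPPED_COLUMNS = {"fastq_scan_version"}


def update_column_names(columns):
    """Translate pre-v1.1.0 TheiaProk_ONT column names to v1.1.0 names.

    Renamed columns are replaced, dropped columns are omitted, and all
    other names pass through unchanged. Input order is preserved.
    """
    updated = []
    for name in columns:
        if name in DROPPED_COLUMNS:
            continue
        updated.append(RENAMED_COLUMNS.get(name, name))
    return updated


old_selection = ["num_reads_raw1", "est_coverage_raw", "fastq_scan_version"]
new_selection = update_column_names(old_selection)
```

TheiaCoV_ONT needs less adjustment under the same reading of the lists: num_reads_clean1 and num_reads_raw1 keep their names there, and only fastq_scan_version is removed.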
Bug Fixes
- Corrected an inaccurate file extension in the augur workflow.
- Adjusted several files to meet the style guide.
- Adjusted the default value for the core_genome input in Snippy_Tree to be true.
- Fixed a bug in the summarize_data task.
- Fixed a bug and added new outputs in the SRA_Fetch workflow.
- Enabled the skipping of extra header columns in the Concatenate_Column_Content workflow.
- Added the .gfa file from Dragonflye as output.
- Updated default docker images and dataset tags for the Pangolin and Nextclade tasks.
- Updated the GAMBIT database to v1.1.0.
- The GAMBIT docker image has been updated to use the latest GAMBIT version.
- Fixed a bug in file name parsing in the Lyve_Set_PHB workflow.
- Skipped the genome size estimation in the read_screen task for all ONT workflows.
What's Changed
- update default docker for busco to GAR docker image by @kapsakcj in #132
- change file extension by @sage-wright in #134
- minor mashtree improvements by @kapsakcj in #142
- [TheiaProk] expose kleborate_virulence_score and kleborate_resistance_score by @cimendes in #146
- Explode workflows by @sage-wright in #135
- Usher_PHB by @sage-wright in #149
- Snippy_Tree core_genome default value by @sage-wright in #144
- summarize_data task bug fix: -z bash conditional by @kapsakcj in #153
- SRA_fetch workflow & fastq-dl task improvements by @kapsakcj in #150
- Terra_2_GISAID by @sage-wright in #148
- Skip extra headers in Concatenate_Column_Content by @sage-wright in #162
- Deprecate the use of cg_pipeline for nanoplot stats by @cimendes in #164
- Update defaults by @sage-wright in #171
- update default gambit docker by @sage-wright in #173
- lyveset fastq file parsing bugfix and other improvements by @kapsakcj in #156
- update lyveSET FASTQ parsing by @kapsakcj in #177
Full Changelog: v1.0.1...v1.1.0