Public Health Bioinformatics v1.1.0 Release Notes

This minor release introduces two new workflows, changes the outputs for the ONT workflows, and resolves various bugs.

New workflows:

Terra_2_GISAID
This workflow will submit concatenated metadata and assembly files to GISAID directly from Terra. The user must obtain a GISAID client-id before they can use this workflow.
Usher_PHB
This workflow will place your samples onto the most up-to-date versions of UCSC's UShER phylogenetic trees and return subtree(s) that the user can visualize.

Major output changes in TheiaCoV_ONT and TheiaProk_ONT workflows

We identified an issue when using cg_pipeline in our ONT workflows that led to inaccurate QC metrics. We have corrected this issue by deprecating the use of cg_pipeline in all ONT workflows. QC metrics are now calculated using nanoplot, which is a tool geared specifically for ONT data. In addition, since fastq-scan is now redundant in these workflows, it has been removed.

Also, the maximum read length in TheiaProk_ONT was previously set to 10,000 base pairs. We have increased this to 100,000 base pairs by default.

TheiaProk_ONT New Outputs
The following columns are new:
- nanoplot_num_reads_clean1
- nanoplot_num_reads_raw1
- nanoplot_r1_mean_q_clean
- nanoplot_r1_mean_q_raw
- nanoplot_r1_mean_readlength_clean
- nanoplot_r1_mean_readlength_raw
- nanoplot_tsv_clean
- nanoplot_tsv_raw
- nanoplot_version
- nanoplot_docker
- nanoplot_html_clean
- nanoplot_html_raw
The following variables are now generated using nanoplot:
- est_coverage_raw
- est_coverage_clean
The following variables have been removed:
- num_reads_clean1
- num_reads_raw1
- r1_mean_q_raw
- r1_mean_readlength_raw
- fastq_scan_version
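
For downstream scripts that reference TheiaProk_ONT column names, the renames above can be sketched as a small lookup. This is a hypothetical helper, not part of the PHB repository; it assumes each removed column corresponds to the new column of the same name with a `nanoplot_` prefix, which the lists above suggest but do not state. It only updates column names: values from earlier runs were computed by different tools and would need to be regenerated by rerunning the workflow.

```python
# Hypothetical helper for downstream scripts; not part of the PHB repository.
# Assumption: each removed column maps to the "nanoplot_"-prefixed new column.
RENAMED_COLUMNS = {
    "num_reads_clean1": "nanoplot_num_reads_clean1",
    "num_reads_raw1": "nanoplot_num_reads_raw1",
    "r1_mean_q_raw": "nanoplot_r1_mean_q_raw",
    "r1_mean_readlength_raw": "nanoplot_r1_mean_readlength_raw",
}

# Removed in v1.1.0 with no replacement listed in these notes.
DROPPED_COLUMNS = {"fastq_scan_version"}


def update_expected_columns(columns):
    """Translate a list of v1.0.x TheiaProk_ONT column names to v1.1.0 names.

    Dropped columns are omitted; unrelated columns pass through unchanged.
    """
    updated = []
    for column in columns:
        if column in DROPPED_COLUMNS:
            continue
        updated.append(RENAMED_COLUMNS.get(column, column))
    return updated
```

A script that selects columns from an exported data table could pass its existing column list through `update_expected_columns` before reading v1.1.0 results.
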
TheiaCoV_ONT New Outputs
The following columns are new:
- nanoplot_tsv_clean
- nanoplot_tsv_raw
- nanoplot_version
- nanoplot_docker
- nanoplot_html_clean
- nanoplot_html_raw
- est_coverage_raw
- est_coverage_clean
- r1_mean_readlength_clean
- r1_mean_readlength_raw
- r1_mean_q_clean
- r1_mean_q_raw
The following variables are now generated using nanoplot:
- num_reads_clean1
- num_reads_raw1
The following variables have been removed:
- fastq_scan_version

Bug Fixes

Corrected an inaccurate file extension in the augur workflow.
Adjusted several files to meet the style guide.
Adjusted the default value for the core_genome input in Snippy_Tree to be true.
Fixed a bug in a bash -z conditional in the summarize_data task.
Fixed a bug and added new outputs in the SRA_Fetch workflow.
Enabled the skipping of extra header columns in the Concatenate_Column_Content workflow.
Added the .gfa file from Dragonflye as an output.
Updated default docker images and dataset tags for the Pangolin and Nextclade tasks.
Updated the GAMBIT database to v1.1.0
Updated the GAMBIT docker image to use the latest GAMBIT version.
Fixed a bug in file name parsing in the Lyve_Set_PHB workflow.
Skipped the genome size estimation in the read_screen task for all ONT workflows.

What's Changed

update default docker for busco to GAR docker image by @kapsakcj in #132
change file extension by @sage-wright in #134
minor mashtree improvements by @kapsakcj in #142
[TheiaProk] expose kleborate_virulence_score and kleborate_resistance_score by @cimendes in #146
Explode workflows by @sage-wright in #135
Usher_PHB by @sage-wright in #149
Snippy_Tree core_genome default value by @sage-wright in #144
summarize_data task bug fix: -z bash conditional by @kapsakcj in #153
SRA_fetch workflow & fastq-dl task improvements by @kapsakcj in #150
Terra_2_GISAID by @sage-wright in #148
Skip extra headers in Concatenate_Column_Content by @sage-wright in #162
Deprecate the use of cg_pipeline for nanoplot stats by @cimendes in #164
Update defaults by @sage-wright in #171
update default gambit docker by @sage-wright in #173
lyveset fastq file parsing bugfix and other improvements by @kapsakcj in #156
update lyveSET FASTQ parsing by @kapsakcj in #177

Full Changelog: v1.0.1...v1.1.0
