diff --git a/joss.05313/10.21105.joss.05313.crossref.xml b/joss.05313/10.21105.joss.05313.crossref.xml new file mode 100644 index 0000000000..be503f04ba --- /dev/null +++ b/joss.05313/10.21105.joss.05313.crossref.xml @@ -0,0 +1,284 @@ + + + + 20231023T102350-54b811bc7775e6e383e5c243d2550279a34f8fea + 20231023102349 + + JOSS Admin + admin@theoj.org + + The Open Journal + + + + + Journal of Open Source Software + JOSS + 2475-9066 + + 10.21105/joss + https://joss.theoj.org + + + + + 10 + 2023 + + + 8 + + 90 + + + + QuaC: A Pipeline Implementing Quality Control Best +Practices for Genome Sequencing and Exome Sequencing Data + + + + Manavalan + Gajapathy + https://orcid.org/0000-0002-8606-0113 + + + Brandon M. + Wilk + https://orcid.org/0000-0002-4110-2324 + + + Elizabeth A. + Worthey + https://orcid.org/0000-0003-4083-7764 + + + + 10 + 23 + 2023 + + + 5313 + + + 10.21105/joss.05313 + + + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + + + + Software archive + 10.5281/zenodo.10002036 + + + GitHub review issue + https://github.com/openjournals/joss-reviews/issues/5313 + + + + 10.21105/joss.05313 + https://joss.theoj.org/papers/10.21105/joss.05313 + + + https://joss.theoj.org/papers/10.21105/joss.05313.pdf + + + + + + MultiQC: Summarize analysis results for +multiple tools and samples in a single report + Ewels + Bioinformatics + 19 + 32 + 10.1093/bioinformatics/btw354 + 1367-4803 + 2016 + Ewels, P., Magnusson, M., Lundin, S., +& Käller, M. (2016). MultiQC: Summarize analysis results for +multiple tools and samples in a single report. Bioinformatics, 32(19), +3047–3048. +https://doi.org/10.1093/bioinformatics/btw354 + + + Commonalities across computational workflows +for uncovering explanatory variants in undiagnosed cases + Kobren + Genetics in Medicine + 6 + 23 + 10.1038/s41436-020-01084-8 + 1530-0366 + 2021 + Kobren, S. N., Baldridge, D., +Velinder, M., Krier, J. 
B., LeBlanc, K., Esteves, C., Pusey, B. N., +Züchner, S., Blue, E., Lee, H., Huang, A., Bastarache, L., Bican, A., +Cogan, J., Marwaha, S., Alkelai, A., Murdock, D. R., Liu, P., Wegner, D. +J., … Kohane, I. S. (2021). Commonalities across computational workflows +for uncovering explanatory variants in undiagnosed cases. Genetics in +Medicine, 23(6), 1075–1085. +https://doi.org/10.1038/s41436-020-01084-8 + + + Best practices for the analytical validation +of clinical whole-genome sequencing intended for the diagnosis of +germline disease + Marshall + npj Genomic Medicine + 1 + 5 + 10.1038/s41525-020-00154-9 + 2056-7944 + 2020 + Marshall, C. R., Chowdhury, S., Taft, +R. J., Lebo, M. S., Buchan, J. G., Harrison, S. M., Rowsey, R., Klee, E. +W., Liu, P., Worthey, E. A., Jobanputra, V., Dimmock, D., Kearney, H. +M., Bick, D., Kulkarni, S., Taylor, S. L., Belmont, J. W., Stavropoulos, +D. J., & Lennon, N. J. (2020). Best practices for the analytical +validation of clinical whole-genome sequencing intended for the +diagnosis of germline disease. Npj Genomic Medicine, 5(1), 1–12. +https://doi.org/10.1038/s41525-020-00154-9 + + + Qualimap 2: Advanced multi-sample quality +control for high-throughput sequencing data + Okonechnikov + Bioinformatics + 10.1093/bioinformatics/btv566 + 1367-4803 + 2015 + Okonechnikov, K., Conesa, A., & +García-Alcalde, F. (2015). Qualimap 2: Advanced multi-sample quality +control for high-throughput sequencing data. Bioinformatics, btv566. +https://doi.org/10.1093/bioinformatics/btv566 + + + Picard toolkit + Picard toolkit. (n.d.). Broad +Institute. +https://github.com/broadinstitute/picard + + + Mosdepth: Quick coverage calculation for +genomes and exomes + Pedersen + Bioinformatics + 5 + 34 + 10.1093/bioinformatics/btx699 + 1367-4803 + 2018 + Pedersen, B. S., & Quinlan, A. R. +(2018). Mosdepth: Quick coverage calculation for genomes and exomes. +Bioinformatics, 34(5), 867–868. 
+https://doi.org/10.1093/bioinformatics/btx699 + + + Indexcov: Fast coverage quality control for +whole-genome sequencing + Pedersen + GigaScience + 11 + 6 + 10.1093/gigascience/gix090 + 2047-217X + 2017 + Pedersen, B. S., Collins, R. L., +Talkowski, M. E., & Quinlan, A. R. (2017). Indexcov: Fast coverage +quality control for whole-genome sequencing. GigaScience, 6(11). +https://doi.org/10.1093/gigascience/gix090 + + + Covviz + Covviz. (n.d.). +https://github.com/brwnj/covviz + + + Twelve years of SAMtools and +BCFtools + Danecek + GigaScience + 2 + 10 + 10.1093/gigascience/giab008 + 2047-217X + 2021 + Danecek, P., Bonfield, J. K., Liddle, +J., Marshall, J., Ohan, V., Pollard, M. O., Whitwham, A., Keane, T., +McCarthy, S. A., Davies, R. M., & Li, H. (2021). Twelve years of +SAMtools and BCFtools. GigaScience, 10(2), giab008. +https://doi.org/10.1093/gigascience/giab008 + + + Ancestry-agnostic estimation of DNA sample +contamination from sequence reads + Zhang + Genome Research + 2 + 30 + 10.1101/gr.246934.118 + 1088-9051 + 2020 + Zhang, F., Flickinger, M., Taliun, S. +A. G., InPSYght Psychiatric Genetics Consortium, Abecasis, G. R., Scott, +L. J., McCaroll, S. A., Pato, C. N., Boehnke, M., & Kang, H. M. +(2020). Ancestry-agnostic estimation of DNA sample contamination from +sequence reads. Genome Research, 30(2), 185–194. +https://doi.org/10.1101/gr.246934.118 + + + Somalier: Rapid relatedness estimation for +cancer and germline studies using efficient genome +sketches + Pedersen + Genome Medicine + 1 + 12 + 10.1186/s13073-020-00761-2 + 1756-994X + 2020 + Pedersen, B. S., Bhetariya, P. J., +Brown, J., Kravitz, S. N., Marth, G., Jensen, R. L., Bronner, M. P., +Underhill, H. R., & Quinlan, A. R. (2020). Somalier: Rapid +relatedness estimation for cancer and germline studies using efficient +genome sketches. Genome Medicine, 12(1), 62. 
+https://doi.org/10.1186/s13073-020-00761-2 + + + FastQ Screen: A tool for multi-genome mapping +and quality control + Wingett + F1000Research + 7 + 10.12688/f1000research.15931.2 + 2046-1402 + 2018 + Wingett, S. W., & Andrews, S. +(2018). FastQ Screen: A tool for multi-genome mapping and quality +control. F1000Research, 7, 1338. +https://doi.org/10.12688/f1000research.15931.2 + + + FastQC + Andrews + 2012 + Andrews, S., Krueger, F., +Segonds-Pichon, A., Biggins, L., Krueger, C., & Wingett, S. (2012). +FastQC. + + + + + + diff --git a/joss.05313/10.21105.joss.05313.jats b/joss.05313/10.21105.joss.05313.jats new file mode 100644 index 0000000000..9047157dc8 --- /dev/null +++ b/joss.05313/10.21105.joss.05313.jats @@ -0,0 +1,775 @@ + + +
+ + + + +Journal of Open Source Software +JOSS + +2475-9066 + +Open Journals + + + +5313 +10.21105/joss.05313 + +QuaC: A Pipeline Implementing Quality Control Best +Practices for Genome Sequencing and Exome Sequencing +Data + + + +https://orcid.org/0000-0002-8606-0113 + +Gajapathy +Manavalan + + + +* + + +https://orcid.org/0000-0002-4110-2324 + +Wilk +Brandon M. + + + + + +https://orcid.org/0000-0003-4083-7764 + +Worthey +Elizabeth A. + + + +* + + + +Center for Computational Genomics and Data Science, The +University of Alabama at Birmingham, Birmingham, Alabama, United States +of America + + + + +Department of Genetics, Heersink School of Medicine, The +University of Alabama at Birmingham, Birmingham, Alabama, United States +of America + + + + +* E-mail: +* E-mail: + + +15 +1 +2023 + +8 +90 +5313 + +Authors of papers retain copyright and release the +work under a Creative Commons Attribution 4.0 International License (CC +BY 4.0) +2022 +The article authors + +Authors of papers retain copyright and release the work under +a Creative Commons Attribution 4.0 International License (CC BY +4.0) + + + +snakemake +Python +quality control +genome sequencing +exome sequencing +QC review +multiqc +singularity +bam +vcf + + + + + + Summary +

Quality Control (QC) of human genome sequencing and exome + sequencing data is necessary to ensure they are of sufficient quality + for downstream analyses. While several QC tools are available to + measure quality parameters at various levels post-sequencing, their + output needs to be reviewed and interpreted in a manual and + time-consuming process. Such manual review is a major challenge + towards standardization and consistency, as the process can be + subjective depending on the reviewer. To address these difficulties, + we have developed QuaC, which implements, integrates, and standardizes + QC best practices at our Center. It performs three major steps: (1) + runs several QC tools using data produced by the read alignment (BAM) + and small variant calling (VCF) as input and optionally accepts QC + output for raw sequencing reads (FASTQ); (2) executes QuaC-Watch to + perform a QC checkup based on the expected thresholds for quality + metrics; and (3) aggregates QC metrics produced by all the QC tools as + well as QuaC-Watch results into a single, self-contained MultiQC report, + both at the per-sample and across-project levels. This report provides + aggregate summaries for all samples within a project/cohort for + efficient comprehensive review while still allowing for granular + review down to individual metrics for a single sample. Finally, we + have developed a “Sample QC review system” schema to standardize QC + reviewers’ logging of results and simplify downstream users’ + interpretation of the reviewers’ findings.

+
+ + Statement of need +

Application of genome sequencing (GS) and exome sequencing (ES) + based approaches has increased dramatically for both research and + clinical purposes over the last decade. Several quality control (QC) + tools have become available to help ensure that sequenced reads meet + expected measures of quality, and to identify process-related errors + such as sample swaps or contamination. In recent years, efforts have + been made to define QC metrics and acceptable thresholds for QC + standardization across research groups + (Kobren + et al., 2021; + Marshall + et al., 2020). Despite these advances, integrating QC output + from multiple tools, performing QC review in a standardized manner, + and logging QC review results in an accessible and easy-to-understand + manner to inform downstream consumers of the data remain a burden. + Lack of defined procedures and appropriate shareable outputs for the + latter step can result in downstream consumers proceeding unaware of + QC issues. Without these outputs, downstream consumers often + re-generate QC metrics, at times with limited expertise, wasting time + and effort. Here, we present QuaC, a pipeline that integrates several + QC tools and summarizes QC metrics for GS and ES samples using + pre-defined and user-configurable thresholds to highlight potentially + problematic samples. Further, we provide a system for interpretation + of QC metrics called the “Sample QC Review System”, which supports + recording of QC review results in a standardized manner.

+ + QuaC Development +

QuaC is a configurable pipeline developed using Snakemake and + Python. QuaC provides a command-line interface (CLI), written in + Python, to support user input, configuration, and execution. + System-level tests along with mock data and example input + configuration files are included in QuaC to assert correct operation + after installation and to test future developments. Unit jobs triggered by + QuaC are executed in a Singularity container environment, as this + setup provides the major advantage of reproducibility and + portability across various user environments. QuaC is run at the + project level, and samples in the project are provided as input in a + pedigree file format (.ped), where sample + metadata such as sample relatedness and sex can be optionally + provided.
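As an illustration of the pedigree input described above, the following is a minimal sketch of reading a conventional six-column pedigree file (family ID, sample ID, father ID, mother ID, sex, phenotype). It is not QuaC's own parser: the names `PedRecord` and `parse_ped` are hypothetical, and the assumption that sex is coded as 1 (male) or 2 (female) follows the common pedigree convention rather than anything stated in this paper.

```python
from typing import List, NamedTuple


class PedRecord(NamedTuple):
    """One row of a six-column pedigree file (hypothetical helper type)."""
    family_id: str
    sample_id: str
    father_id: str
    mother_id: str
    sex: str
    phenotype: str


# Assumed convention: 1 = male, 2 = female, anything else = unknown.
SEX_CODES = {"1": "male", "2": "female"}


def parse_ped(text: str) -> List[PedRecord]:
    """Parse whitespace-delimited pedigree text into records.

    Blank lines and lines starting with '#' are skipped.
    """
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            raise ValueError(
                f"line {line_number}: expected 6 columns, found {len(fields)}"
            )
        family, sample, father, mother, sex, phenotype = fields[:6]
        records.append(
            PedRecord(family, sample, father, mother,
                      SEX_CODES.get(sex, "unknown"), phenotype)
        )
    return records


if __name__ == "__main__":
    example = "FAM1 child dad mom 2 2\nFAM1 dad 0 0 1 1\nFAM1 mom 0 0 2 1\n"
    for record in parse_ped(example):
        print(record.sample_id, record.sex)
```

Keeping relatedness and sex in one project-level file is what would allow downstream checks to compare stated and estimated values per sample.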

+
+ + QC Tools Utilized +

QuaC runs several QC tools + (Table 1) using BAM + and VCF files as input. These support identification of sequencing, + alignment, and variant calling related issues, within-species + contamination, and sample swaps or incorrectly stated relationships + between samples based on sex, ancestry, and relatedness estimations. + Besides these tools, QuaC can optionally consume output from three + QC tools executed separately from QuaC: FastQC to check quality of raw + sequence reads + (Andrews + et al., 2012), FastQ Screen to check for cross-species + contamination using raw sequence reads + (Wingett + & Andrews, 2018), and Picard-MarkDuplicates to check for + read duplication in BAM files + (Picard + Toolkit, n.d.). While QuaC cannot run these QC + tools, it can utilize their output as part of QC metric aggregation + and summarization.

+ + +

QC tools used in QuaC. Note that this list does not include + tools whose output QuaC can consume when run with the + --include_prior_qc flag. +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ToolUsage in QuaCQC type
Qualimap + (Okonechnikov + et al., 2015)Summarizes several alignment metrics using BAM fileBAM quality
Picard-CollectMultipleMetrics + (Picard + Toolkit, n.d.)Summarizes alignment metrics from BAM file using several + modulesBAM quality
Picard-CollectWgsMetrics + (Picard + Toolkit, n.d.)Collects metrics about coverage and performance using + BAM fileBAM quality
Mosdepth + (Brent + S. Pedersen & Quinlan, 2018)Fast alignment depth calculation using BAM fileBAM quality
Indexcov + (Brent + S. Pedersen et al., 2017)Estimates coverage from BAM index for GS (Skipped in + exome mode)BAM quality
Covviz + (Covviz, + n.d.)Identifies large, coverage-based anomalies for GS using + Indexcov output (Skipped in exome mode)BAM quality
Bcftools stats + (Danecek + et al., 2021)Summarizes VCF file statsVCF quality
VerifyBamID2 + (Zhang + et al., 2020)Estimates within-species (i.e., cross-sample) + contamination using BAM fileWithin-species contamination
Somalier + (Brent + S. Pedersen et al., 2020)Estimation of sex, ancestry and relatedness using BAM + fileSex, ancestry, and relatedness estimation
+
+
+ + QC Checkup Using QuaC-Watch +

QuaC includes a tool called QuaC-Watch, which consumes results + from the above-mentioned QC tools, compares QC metrics against the + acceptable thresholds, and summarizes results using color-coded + pass/fail flags for efficient review + ([fig:multiqc]). + This summary allows users to quickly review output from multiple QC + tools, identify whether samples meet expected quality thresholds, + and readily highlight samples that need further review. Reasonable + default thresholds for QC metrics have been carefully selected and + built into QuaC-Watch. These are applicable to most GS and ES data but + are also configurable by the user. QC metrics and thresholds were + curated based on the literature + (Kobren + et al., 2021; + Marshall + et al., 2020), in-house analyses using many hundreds of both + GS and ES samples, and knowledge gained from our past experiences. + Integration of QC metrics and associated thresholds into QuaC not + only assists with standardization of our internal QC review process, + but also supports review and reusability between groups. We believe + release of this information provides utility to the community. To + our knowledge, this type of curated collection spanning an + integrated suite of tools has not been made available + previously.
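The threshold comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not QuaC-Watch's implementation: the class `Threshold`, the function `check_sample`, the metric names, and the cutoff values are all hypothetical and chosen only to show the pass/fail logic.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Threshold:
    """Acceptable range for one QC metric; None leaves that side unbounded."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def passes(self, value: float) -> bool:
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


# Hypothetical metric names and cutoffs, for illustration only.
EXAMPLE_THRESHOLDS = {
    "mean_coverage": Threshold(minimum=30.0),
    "contamination_fraction": Threshold(maximum=0.02),
}


def check_sample(metrics: Dict[str, float],
                 thresholds: Dict[str, Threshold]) -> Dict[str, str]:
    """Flag each thresholded metric as 'pass', 'fail', or 'missing'."""
    flags = {}
    for name, threshold in thresholds.items():
        if name not in metrics:
            flags[name] = "missing"
        elif threshold.passes(metrics[name]):
            flags[name] = "pass"
        else:
            flags[name] = "fail"
    return flags


if __name__ == "__main__":
    sample_metrics = {"mean_coverage": 35.0, "contamination_fraction": 0.05}
    print(check_sample(sample_metrics, EXAMPLE_THRESHOLDS))
```

Holding thresholds in a plain mapping, separate from the comparison logic, is one way to keep defaults user-configurable as the text describes.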

+
+ + QC Aggregation +

To minimize the time needed to review QC metrics and assess + quality of samples across a project, QuaC aggregates results produced + by all the QC tools and QuaC-Watch, using MultiQC + (Ewels + et al., 2016), into per-sample and across-project stand-alone + interactive HTML reports. The QuaC-Watch summary is presented as the + first section of the report for initial review, followed by + individual QC tool outputs for deeper review of metrics where + high-level findings warrant it + ([fig:multiqc]). + Availability of MultiQC reports at both sample and project level + enables easier review and distribution of QC results internally as + well as with external collaborators.

+ +

Aggregation and visualization of QC tool output and + QuaC-Watch output using MultiQC at the project level. The QuaC-Watch + section shown here enables quick review of samples’ QC results and + helps to quickly identify samples that need further review. Users + may optionally toggle columns to view values for QC metrics of + interest and hover over the column title to view thresholds used + by QuaC-Watch (highlighted by red arrow). In addition to this + project-level report, a similar MultiQC report is created at the + single-sample level for all the samples, which shows summarized QC + results for only one + sample.

+ +
+
+ + QC Review Process +

Consistent and understandable dissemination of QC review results + can be challenging when quality issues are identified, and even more + so when these issues hamper accurate downstream analyses or + interpretation. To reduce this burden, we devised a “Sample QC + review system” where QC review results are flagged as pass, + acceptable, poor, or fail, along with a free text field for review + comments (Table 2). + This system allows data consumers to rapidly check samples for + issues and also points them to the known or likely cause of the + issue. Since not all QC issues are catastrophic, this aids in rapid + determination as to whether specific samples can be used for + intended purposes. As not all users are proficient in interpreting + results from the various QC tools, this system has proven helpful in + enabling assessment and ensuring the quality of the conclusions + based on these data.

+ + +

Fields logged in Sample QC database using controlled flags. + Type 1 flags are pass, acceptable, poor, and fail. Type 2 flags + are pass, fail, and not applicable. +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
FieldExplanationAllowed values
Sample - Overall StatusOverall QC status considering results of all QC + performedType 1 flags
FASTQOverall QC status considering results of all QC + performed at FASTQ levelType 1 flags
FASTQ CommentComments on QC at FASTQ level (e.g., small insert size, + high adapter content, etc.)Free text
BAMOverall QC status considering results of all QC + performed at BAM levelType 1 flags
BAM CommentComments on QC at BAM level (e.g., low mean coverage, + high duplication rate, etc.)Free text
VCFOverall QC status considering results of all QC + performed at VCF levelType 1 flags
VCF CommentComments on QC at VCF level (e.g., small insert size, + high adapter content, etc.)Free text
Other Species ContaminationSample contamination status due to other species’ + genomic materialType 1 flags
Human Cross-contaminationSample contamination status due to other human’s genomic + materialType 1 flags
Sex CheckDid the predicted sex match the expected sex?Type 2 flags
Relatedness CheckDid the predicted relatedness match expected + relatedness?Type 2 flags
Ancestry CheckDid the predicted ancestry match expected ancestry?Type 2 flags
Other Comments/NotesAny other comments/notes concerning QCFree text
+
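The controlled vocabulary in Table 2 can be sketched as a small typed record. The flag values and field list are taken from the table; the class names `Type1Flag`, `Type2Flag`, and `SampleQCReview`, and the choice of Python enums and a dataclass, are illustrative assumptions and do not describe how the Sample QC database is actually implemented.

```python
from dataclasses import dataclass
from enum import Enum


class Type1Flag(Enum):
    PASS = "pass"
    ACCEPTABLE = "acceptable"
    POOR = "poor"
    FAIL = "fail"


class Type2Flag(Enum):
    PASS = "pass"
    FAIL = "fail"
    NOT_APPLICABLE = "not applicable"


@dataclass
class SampleQCReview:
    """One reviewer-entered record; comment fields are free text."""
    overall_status: Type1Flag
    fastq: Type1Flag
    bam: Type1Flag
    vcf: Type1Flag
    other_species_contamination: Type1Flag
    human_cross_contamination: Type1Flag
    sex_check: Type2Flag
    relatedness_check: Type2Flag
    ancestry_check: Type2Flag
    fastq_comment: str = ""
    bam_comment: str = ""
    vcf_comment: str = ""
    other_comments: str = ""


if __name__ == "__main__":
    review = SampleQCReview(
        overall_status=Type1Flag("acceptable"),
        fastq=Type1Flag.PASS,
        bam=Type1Flag.ACCEPTABLE,
        vcf=Type1Flag.PASS,
        other_species_contamination=Type1Flag.PASS,
        human_cross_contamination=Type1Flag.PASS,
        sex_check=Type2Flag.PASS,
        relatedness_check=Type2Flag("not applicable"),
        ancestry_check=Type2Flag.PASS,
        bam_comment="low mean coverage",
    )
    print(review.overall_status.value, "-", review.bam_comment)
```

Constructing a flag from an unlisted string, such as `Type1Flag("not applicable")`, raises `ValueError`, so the two flag types cannot be mixed up silently.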
+
+ + Source Code and Documentation +

Source code for QuaC is available for download at + https://github.com/uab-cgds-worthey/quac under GNU GPLv3 license. + Installation, setup, configuration, and usage documentation is + available at https://quac.readthedocs.io.

+
+
+ + Acknowledgements + + +

We would like to thank Donna Brown for providing feedback on + the utility of QuaC-Watch in research projects.

+
+ +

This work was supported in part by an award from the CF + Foundation to Dr. Worthey (WORTHE19A0) and from UAB SOM Start-up + funds to Dr. Worthey.

+
+
+
+ + + + + + + EwelsPhilip + MagnussonMåns + LundinSverker + KällerMax + + MultiQC: Summarize analysis results for multiple tools and samples in a single report + Bioinformatics + 201610 + 20230111 + 32 + 19 + 1367-4803 + https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btw354 + 10.1093/bioinformatics/btw354 + 3047 + 3048 + + + + + + KobrenShilpa Nadimpalli + BaldridgeDustin + VelinderMatt + KrierJoel B. + LeBlancKimberly + EstevesCecilia + PuseyBarbara N. + ZüchnerStephan + BlueElizabeth + LeeHane + HuangAlden + BastaracheLisa + BicanAnna + CoganJoy + MarwahaShruti + AlkelaiAnna + MurdockDavid R. + LiuPengfei + WegnerDaniel J. + PaulAlexander J. + SunyaevShamil R. + KohaneIsaac S. + + Commonalities across computational workflows for uncovering explanatory variants in undiagnosed cases + Genetics in Medicine + 202106 + 20230111 + 23 + 6 + 1530-0366 + https://www.nature.com/articles/s41436-020-01084-8 + 10.1038/s41436-020-01084-8 + 1075 + 1085 + + + + + + MarshallChristian R. + ChowdhuryShimul + TaftRyan J. + LeboMathew S. + BuchanJillian G. + HarrisonSteven M. + RowseyRoss + KleeEric W. + LiuPengfei + WortheyElizabeth A. + JobanputraVaidehi + DimmockDavid + KearneyHutton M. + BickDavid + KulkarniShashikant + TaylorStacie L. + BelmontJohn W. + StavropoulosDimitri J. + LennonNiall J. 
+ + Best practices for the analytical validation of clinical whole-genome sequencing intended for the diagnosis of germline disease + npj Genomic Medicine + 202010 + 20230111 + 5 + 1 + 2056-7944 + https://www.nature.com/articles/s41525-020-00154-9 + 10.1038/s41525-020-00154-9 + 1 + 12 + + + + + + OkonechnikovKonstantin + ConesaAna + García-AlcaldeFernando + + Qualimap 2: Advanced multi-sample quality control for high-throughput sequencing data + Bioinformatics + 201510 + 20230111 + 1367-4803 + https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btv566 + 10.1093/bioinformatics/btv566 + btv566 + + + + + + Picard toolkit + Broad Institute + https://github.com/broadinstitute/picard + + + + + + PedersenBrent S + QuinlanAaron R + + Mosdepth: Quick coverage calculation for genomes and exomes + Bioinformatics + + HancockJohn + + 201803 + 20230111 + 34 + 5 + 1367-4803 + https://academic.oup.com/bioinformatics/article/34/5/867/4583630 + 10.1093/bioinformatics/btx699 + 867 + 868 + + + + + + PedersenBrent S + CollinsRyan L + TalkowskiMichael E + QuinlanAaron R + + Indexcov: Fast coverage quality control for whole-genome sequencing + GigaScience + 201711 + 20230111 + 6 + 11 + 2047-217X + https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/gix090/4160383 + 10.1093/gigascience/gix090 + + + + + Covviz + https://github.com/brwnj/covviz + + + + + + DanecekPetr + BonfieldJames K + LiddleJennifer + MarshallJohn + OhanValeriu + PollardMartin O + WhitwhamAndrew + KeaneThomas + McCarthyShane A + DaviesRobert M + LiHeng + + Twelve years of SAMtools and BCFtools + GigaScience + 202101 + 20230111 + 10 + 2 + 2047-217X + https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giab008/6137722 + 10.1093/gigascience/giab008 + giab008 + + + + + + + ZhangFan + FlickingerMatthew + TaliunSarah A. Gagliano + InPSYght Psychiatric Genetics Consortium + AbecasisGonçalo R. + ScottLaura J. + McCarollSteven A. + PatoCarlos N. 
+ BoehnkeMichael + KangHyun Min + + Ancestry-agnostic estimation of DNA sample contamination from sequence reads + Genome Research + 202002 + 20230111 + 30 + 2 + 1088-9051 + http://genome.cshlp.org/lookup/doi/10.1101/gr.246934.118 + 10.1101/gr.246934.118 + 185 + 194 + + + + + + PedersenBrent S. + BhetariyaPreetida J. + BrownJoe + KravitzStephanie N. + MarthGabor + JensenRandy L. + BronnerMary P. + UnderhillHunter R. + QuinlanAaron R. + + Somalier: Rapid relatedness estimation for cancer and germline studies using efficient genome sketches + Genome Medicine + 202012 + 20230111 + 12 + 1 + 1756-994X + https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-020-00761-2 + 10.1186/s13073-020-00761-2 + 62 + + + + + + + WingettSteven W. + AndrewsSimon + + FastQ Screen: A tool for multi-genome mapping and quality control + F1000Research + 201809 + 20230112 + 7 + 2046-1402 + https://f1000research.com/articles/7-1338/v2 + 10.12688/f1000research.15931.2 + 1338 + + + + + + + AndrewsSimon + KruegerFelix + Segonds-PichonAnne + BigginsLaura + KruegerChristel + WingettSteven + + FastQC + 201201 + + + + +
diff --git a/joss.05313/10.21105.joss.05313.pdf b/joss.05313/10.21105.joss.05313.pdf new file mode 100644 index 0000000000..37dbd3deb1 Binary files /dev/null and b/joss.05313/10.21105.joss.05313.pdf differ diff --git a/joss.05313/media/images/fig1_multiqc.png b/joss.05313/media/images/fig1_multiqc.png new file mode 100644 index 0000000000..af52ba2051 Binary files /dev/null and b/joss.05313/media/images/fig1_multiqc.png differ