3.2.0
What's Changed
Important CRAM writing bug fix
A serious bug which can cause corrupted reads in CRAM files was discovered and fixed in this release. This bug was introduced in HTSJDK 3.0.0, and affects Picard versions 2.27.3 through 3.1.1 and GATK versions 4.3 through 4.5.
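For auditing which pipeline runs may have written affected CRAMs, the version ranges above can be turned into a small check. This is only a sketch: the `AFFECTED` table and the function names are illustrative, and it assumes version strings are dot-separated numbers such as `3.1.1` or `4.5.0.0`.

```python
import re

# Affected ranges as stated in these release notes (both ends inclusive).
AFFECTED = {
    "picard": ((2, 27, 3), (3, 1, 1)),
    "gatk": ((4, 3), (4, 5)),
}


def parse_version(text):
    """Turn '3.1.1' or '4.5.0.0' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def is_affected(tool, version):
    """Return True if `version` of `tool` falls inside the affected range."""
    low, high = AFFECTED[tool.lower()]
    parsed = parse_version(version)
    # Compare only as many components as each bound specifies, so that
    # GATK 4.5.0.0 still counts as "4.5".
    return low <= parsed[:len(low)] and parsed[:len(high)] <= high


if __name__ == "__main__":
    for tool, version in [("picard", "3.1.1"), ("picard", "3.2.0"), ("gatk", "4.5.0.0")]:
        print(tool, version, "affected" if is_affected(tool, version) else "not affected")
```

A version inside the range only means the bug could have been triggered; whether a given file is actually corrupt depends on its contents.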
The bug occurs when a read is aligned starting at exactly position 1 on a reference contig. As a result, it generally does not affect human autosomes or the X/Y contigs, because these tend to begin with a long run of N bases and reads are therefore not aligned at exactly position 1. The exceptions are T2T references and analyses such as mitochondrial calling.
For more information on the conditions that trigger this bug, see this post.
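As a rough pre-screen for the condition described above, the following sketch counts mapped reads whose alignment starts at position 1, per contig, from SAM-formatted text (for example, records decoded from a file with any SAM-capable tool). The function name is illustrative. It checks only this one necessary condition, so a non-empty result does not prove corruption, and it is not a substitute for the dedicated detector tool.

```python
from collections import Counter


def contigs_with_reads_at_position_one(sam_lines):
    """Count mapped reads whose alignment starts at position 1, per contig."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        # SAM columns: QNAME, FLAG, RNAME, POS (1-based), ...
        flag, contig, pos = int(fields[1]), fields[2], int(fields[3])
        is_unmapped = bool(flag & 0x4)
        if not is_unmapped and contig != "*" and pos == 1:
            counts[contig] += 1
    return counts


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6\tSO:coordinate",
        "read1\t0\tchrM\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "read2\t0\tchr1\t10001\t60\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    print(dict(contigs_with_reads_at_position_one(example)))
```

Contigs that never appear in the result cannot have met the position-1 condition, which makes this a cheap way to decide which files are worth a full scan.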
GATK 4.6 includes a tool called CRAMIssue8768Detector that can scan a CRAM file and report whether it is affected, and if so which regions in the file are corrupt. If you suspect that some of your CRAM files may have been affected, please run this tool on them for confirmation!
See also samtools/htsjdk#1708 for more information.
Better support for remote files
Improvements to allow direct access to remote files continue. In many cases it is now possible to use a remote reference file without localizing it. Files available through http URLs are now directly accessible as well (e.g. https://example.com/my.bam). This allows direct access to signed URLs, although index and supporting files may not be discovered automatically.
- Enable use of cloud reference files by @cmnbroad in #1804
- Adding http-nio as a dependency by @lbergelson in #1929
New features for flow based reads
- MarkDuplicates strategy of flow based reads that looks only at the qualities close to the end of the read by @ilyasoifer in #1942
- CollectQualityYieldMetricsFlowSpace tool by @dror27 in #1932
New Options
- KEEP_ZERO_LENGTH_INTERVALS flag when converting bed -> interval_list by @rickymagner in #1928
- Make the VCF option in CollectSamErrorMetrics optional. by @nh13 in #1476
- Add the EXT argument to CollectSamErrorMetrics. by @nh13 in #1478
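Zero-length intervals are a special case in bed -> interval_list conversion because BED coordinates are 0-based and half-open while interval_list coordinates are 1-based and closed. The sketch below shows the coordinate arithmetic only. The `keep_zero_length` parameter is a hypothetical stand-in for the new flag, and dropping such records by default is an assumption of this sketch, not a description of Picard's implementation.

```python
def bed_to_interval(chrom, bed_start, bed_end, keep_zero_length=False):
    """Convert one BED record (0-based, half-open) to 1-based closed coordinates.

    Returns (chrom, start, end), or None when a zero-length record is dropped.
    `keep_zero_length` is a hypothetical stand-in for the new flag; dropping
    by default is an assumption made for illustration.
    """
    if bed_end < bed_start:
        raise ValueError(f"BED end {bed_end} precedes start {bed_start}")
    start, end = bed_start + 1, bed_end
    if start > end and not keep_zero_length:
        return None  # zero-length BED record (bed_start == bed_end)
    return chrom, start, end


if __name__ == "__main__":
    print(bed_to_interval("chr1", 0, 100))
    print(bed_to_interval("chr1", 5, 5))
    print(bed_to_interval("chr1", 5, 5, keep_zero_length=True))
```

A BED record with equal start and end converts to an interval whose start is one greater than its end, which is why such records need explicit handling.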
Bug fixes
Bug fixes to several tools, as well as important CRAM fixes from an updated htsjdk.
- MarkDuplicates: Add read group ID instead of string "RG" by @michaelgatzen in #1937
- Fix CollectHSMetrics - Don't use Coverage Cap on fold score by @JoeVieira in #1913
- Fix for order flipping in SortingCollection used for MarkDuplicates by @wook-choi in #1945
- Allow fingerprinting of SAM files that only have a partial dictionary match to the haplotype map by @yfarjoun in #1955
- Fix a bug in the liftover logic by @yfarjoun in #1956
Documentation and improved error messages
- Reject piped input (/dev/stdin) for BedToIntervalList by @kockan in #1918
- Update AbstractAlignmentMerger.java Warning Message for Cross Species Contamination by @gokalpcelik in #1960
- MergeBamAlignment documentation by @kachulis in #1922
- Updated SamToFastq documentation by @kockan in #1920
Maintenance and dependency updates
- Update setup_gcloud github action by @lbergelson in #1939
- Update Dockerfile and build_push_docker.sh by @lbergelson in #1921
- Update Gradle to 8.5 by @lbergelson in #1930
- Convert to use the java-library plugin by @lbergelson in #1934
- update htsjdk to 4.1.1 by @lbergelson in #1967
- Update http-nio to 1.1.1 by @droazen in #1968
New Contributors
- @JoeVieira made their first contribution in #1913
- @wook-choi made their first contribution in #1945
- @dror27 made their first contribution in #1932
- @gokalpcelik made their first contribution in #1960
Full Changelog: 3.1.1...3.2.0