Release 1.13
valeriuo committed Jul 7, 2021
2 parents 06e3645 + 186dc93 commit fc13b08
Showing 127 changed files with 18,134 additions and 12,865 deletions.
2 changes: 1 addition & 1 deletion HMM.c
@@ -1,6 +1,6 @@
/* The MIT License
Copyright (c) 2014-2015 Genome Research Ltd.
Copyright (c) 2014-2017 Genome Research Ltd.
Author: Petr Danecek <[email protected]>
2 changes: 1 addition & 1 deletion HMM.h
@@ -1,6 +1,6 @@
/* The MIT License
Copyright (c) 2014-2015 Genome Research Ltd.
Copyright (c) 2014-2016 Genome Research Ltd.
Author: Petr Danecek <[email protected]>
18 changes: 12 additions & 6 deletions INSTALL
@@ -15,7 +15,7 @@ The latest source code can be downloaded from github and compiled using:
---
In order to use the BCFtools plugins, this environment variable must be set and point
to the correct location

export BCFTOOLS_PLUGINS=/path/to/bcftools/plugins

---
@@ -100,14 +100,20 @@ Compilation
./configure
make

If installing from a tarballs (as opposed form github), BCFtools release
contains a copy of HTSlib which will be used to build BCFtools. If you
If installing from a release (as opposed to from GitHub), the BCFtools release
tarball contains a copy of HTSlib which will be used to build BCFtools. If you
already have a system-installed HTSlib or another HTSlib that you would
prefer to build against, you can arrange this by using the configure script's
--with-htslib option. Use --with-htslib=DIR to point to an HTSlib source tree
or installation in DIR; or --with-htslib=system to use a system-installed HTSlib.
When downloaded from github and --with-htslib option is not given, the directory
../htslib is used.
or installation in DIR (if the desired source tree has been configured to
build in a separate build directory, DIR should refer to the build directory);
or use --with-htslib=system to ignore any nearby HTSlib source tree and use
only a system-installed HTSlib.

When --with-htslib is not used, configure looks for an HTSlib source tree
within or alongside the BCFtools source directory; if there are several
likely candidates, you will have to use --with-htslib to choose one. When
using make without running configure first, the directory ../htslib is used.
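
The ways of selecting HTSlib described above can be summarised with a small dry-run helper. This is only a sketch: the function name `htslib_configure_cmd` and the path `/opt/htslib-build` are hypothetical, and the helper prints the configure command instead of running it. Only `./configure`, `--with-htslib=DIR` and `--with-htslib=system` are taken from the text above.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print (do not run) the configure command
# for a given HTSlib choice.
#   $1 = "auto"   -> let configure look for a nearby HTSlib source tree
#   $1 = "system" -> ignore nearby source trees, use a system-installed HTSlib
#   $1 = DIR      -> HTSlib source tree, build directory, or installation in DIR
htslib_configure_cmd() {
    case "${1:-auto}" in
        auto)   echo "./configure" ;;
        system) echo "./configure --with-htslib=system" ;;
        *)      echo "./configure --with-htslib=$1" ;;
    esac
}

htslib_configure_cmd auto
htslib_configure_cmd system
htslib_configure_cmd /opt/htslib-build   # example path, an assumption
```

In each case the printed command would be followed by `make`, as in the Compilation section.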


Optional Compilation with Perl
27 changes: 26 additions & 1 deletion LICENSE
@@ -9,7 +9,7 @@ the INSTALL document), the use of this software is governed by the GPL license.

The MIT/Expat License

Copyright (C) 2012-2014 Genome Research Ltd.
Copyright (C) 2012-2021 Genome Research Ltd.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
@@ -746,3 +746,28 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

-----------------------------------------------------------------------------

LICENSE for utlist.h

Copyright (c) 2007-2014, Troy D. Hanson http://troydhanson.github.com/uthash/
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS
IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER
OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
15 changes: 8 additions & 7 deletions Makefile
@@ -1,6 +1,6 @@
# Makefile for bcftools, utilities for Variant Call Format VCF/BCF files.
#
# Copyright (C) 2012-2017 Genome Research Ltd.
# Copyright (C) 2012-2021 Genome Research Ltd.
#
# Author: Petr Danecek <[email protected]>
#
@@ -42,7 +42,7 @@ OBJS = main.o vcfindex.o tabix.o \
regidx.o smpl_ilist.o csq.o vcfbuf.o \
mpileup.o bam2bcf.o bam2bcf_indel.o bam_sample.o \
vcfsort.o cols.o extsort.o dist.o abuf.o \
ccall.o em.o prob1.o kmin.o
ccall.o em.o prob1.o kmin.o str_finder.o
PLUGIN_OBJS = vcfplugin.o

prefix = /usr/local
@@ -104,7 +104,7 @@ endif

include config.mk

PACKAGE_VERSION = 1.12
PACKAGE_VERSION = 1.13

# If building from a Git repository, replace $(PACKAGE_VERSION) with the Git
# description of the working tree: either a release tag with the same value
@@ -233,6 +233,7 @@ abuf_h = abuf.h $(htslib_vcf_h)
bam2bcf_h = bam2bcf.h $(htslib_hts_h) $(htslib_vcf_h)
bam_sample_h = bam_sample.h $(htslib_sam_h)

str_finder.o: str_finder.h utlist.h
main.o: main.c $(htslib_hts_h) config.h version.h $(bcftools_h)
vcfannotate.o: vcfannotate.c $(htslib_vcf_h) $(htslib_synced_bcf_reader_h) $(htslib_kseq_h) $(htslib_khash_str2int_h) $(bcftools_h) vcmp.h $(filter_h) $(convert_h) $(smpl_ilist_h) regidx.h $(htslib_khash_h)
vcfplugin.o: vcfplugin.c config.h $(htslib_vcf_h) $(htslib_synced_bcf_reader_h) $(htslib_kseq_h) $(htslib_khash_str2int_h) $(bcftools_h) vcmp.h $(filter_h)
@@ -273,9 +274,9 @@ dist.o: dist.c dist.h
cols.o: cols.c cols.h
regidx.o: regidx.c $(htslib_hts_h) $(htslib_kstring_h) $(htslib_kseq_h) $(htslib_khash_str2int_h) regidx.h
consensus.o: consensus.c $(htslib_vcf_h) $(htslib_kstring_h) $(htslib_synced_bcf_reader_h) $(htslib_kseq_h) $(htslib_bgzf_h) regidx.h $(bcftools_h) rbuf.h $(filter_h)
mpileup.o: mpileup.c $(htslib_sam_h) $(htslib_faidx_h) $(htslib_kstring_h) $(htslib_khash_str2int_h) regidx.h $(bcftools_h) $(bam2bcf_h) $(bam_sample_h) $(gvcf_h)
mpileup.o: mpileup.c $(htslib_sam_h) $(htslib_faidx_h) $(htslib_kstring_h) $(htslib_khash_str2int_h) $(htslib_hts_os_h) regidx.h $(bcftools_h) $(bam2bcf_h) $(bam_sample_h) $(gvcf_h)
bam2bcf.o: bam2bcf.c $(htslib_hts_h) $(htslib_sam_h) $(htslib_kstring_h) $(htslib_kfunc_h) $(bam2bcf_h) mw.h
bam2bcf_indel.o: bam2bcf_indel.c $(htslib_hts_h) $(htslib_sam_h) $(htslib_khash_str2int_h) $(bam2bcf_h) $(htslib_ksort_h)
bam2bcf_indel.o: bam2bcf_indel.c $(htslib_hts_h) $(htslib_sam_h) $(htslib_khash_str2int_h) $(bam2bcf_h) $(htslib_ksort_h) str_finder.h
bam_sample.o: bam_sample.c $(htslib_hts_h) $(htslib_kstring_h) $(htslib_khash_str2int_h) $(khash_str2str_h) $(bam_sample_h) $(bcftools_h)
version.o: version.h version.c
hclust.o: hclust.c $(htslib_hts_h) $(htslib_kstring_h) $(bcftools_h) hclust.h
@@ -320,10 +321,10 @@ test/test-regidx: test/test-regidx.o regidx.o | $(HTSLIB)

# make docs target depends the a2x asciidoc program
doc/bcftools.1: doc/bcftools.txt
cd doc && a2x -adate="$(DOC_DATE)" -aversion=$(DOC_VERSION) --doctype manpage --format manpage bcftools.txt
cd doc && asciidoctor -adate="$(DOC_DATE)" -aversion=$(DOC_VERSION) -b manpage -a linkcss -a stylesheet=docbook-xsl.css bcftools.txt

doc/bcftools.html: doc/bcftools.txt
cd doc && a2x -adate="$(DOC_DATE)" -aversion=$(DOC_VERSION) --doctype manpage --format xhtml bcftools.txt
cd doc && asciidoctor -adate="$(DOC_DATE)" -aversion=$(DOC_VERSION) -b html5 -a linkcss -a stylesheet=docbook-xsl.css bcftools.txt

docs: doc/bcftools.1 doc/bcftools.html

161 changes: 160 additions & 1 deletion NEWS
@@ -1,5 +1,164 @@
## Release 1.12 (17th March 2021)
## Release 1.13 (7th July 2021)


This release brings new options and significant changes in BAQ parametrization
in `bcftools mpileup`. The previous behavior can be triggered by providing
the `--config 1.12` option. Please see https://github.com/samtools/bcftools/pull/1474
for details.


Changes affecting the whole of bcftools, or multiple commands:

* Improved build system


Changes affecting specific commands:

* bcftools annotate:

- Fix a rare bug: when INFO/END is present, all INFO fields are removed
with `bcftools annotate -x INFO`, and BCF output is produced, the
removed INFO/END continued to inform the end coordinate and caused
incorrect retrieval of records with the -r option (#1483)

- Support for matching annotation line by ID, in addition to CHROM,POS,REF,
and ALT (#1461)

bcftools annotate -a annots.tab.gz -c CHROM,POS,~ID,REF,ALT,INFO/END input.vcf

* bcftools csq:

- When GFF and VCF/fasta use a different chromosome naming convention
(e.g. chrX vs X), no consequences would be added. The program now
attempts to detect these differences and remove/add the "chr" prefix
to chromosome names to match the GFF and VCF/fasta (#1507)

- Parametrize the brief-predictions parameter to allow an explicit number of
amino acids to be printed. Note that the `-b, --brief-predictions` option
is being replaced with `-B, --trim-protein-seq INT`

* bcftools +fill-tags:

- Generalization and better support for custom functions that allow
adding new INFO tags based on arbitrary `-i, --include` type of
expressions. For example, to calculate a missing INFO/DP annotation
from FORMAT/AD, it is possible to use:

-t 'DP:1=int(sum(FORMAT/AD))'

Here the optional ":1" part specifies that a single value will be
added (by default Number=. is used) and the optional int(...) adds
an integer value (by default Type=Float is used).

- When FORMAT/GT is not present, the INFO/AF tag is now calculated
from INFO/AC and INFO/AN.

* bcftools gtcheck:

- Switch between FORMAT/GT or FORMAT/PL when one is (implicitly) requested
but only the other is available

- Improve diagnostics, printing warnings when a line cannot be matched and
the number of lines skipped for various reasons (#1444)

- Minor bug fix, with PLs being the default, the `--distinctive-sites` option
started to require explicit `--error-probability 0`

* bcftools index:

- The program now accepts both data file name and the index file name. This
adds to user convenience when running index statistics (-n, -s)

* bcftools isec:

- Always generate sites.txt with isec -p (#1462)

* bcftools +mendelian:

- Consider only complete trios, do not crash on sample name typos (#1520)

* bcftools mpileup:

- New `--seed` option for reproducibility of subsampling code in HTSlib

- The SCR annotation which shows the number of soft-clipped reads now
correctly pools reads together regardless of the variant type. Previously
only reads with indels were included at indel sites.

- Major revamp of BAQ. Please see https://github.com/samtools/bcftools/pull/1474
for details. The previous behavior can be triggered by providing the `--config 1.12`
option.

- Thanks to improvements in HTSlib, the removal of overlapping reads (which can
be disabled with the `-x, --ignore-overlaps` options) is not systematically biased
anymore (https://github.com/samtools/htslib/pull/1273)

- Modified scale of Mann-Whitney U tests. Newly INFO/*Z annotations will be printed,
for example MQBZ replaces MQB.

* bcftools norm:

- Fix Type=Flag output in `norm --atomize` (#1472)

- Atomization must not discard ALT=. records

- Atomization of AD and QS tags now correctly updates occurrences of duplicate
alleles within different haplotypes

- Fix a bug in atomization of Number=A,R tags

* bcftools reheader:

- Add `-T, --temp-prefix` option

* bcftools +setGT:

- A wider range of genotypes can be set by the plugin by allowing
specifying custom genotypes. For example, to force a heterozygous
genotype it is now possible to use expressions like:

c:'m|M'
c:0/1
c:0

* bcftools +split-vep:

- New `-u, --allow-undef-tags` option

- Better handling of ambiguous keys such as INFO/AF and CSQ/AD. The
`-p, --annot-prefix` option is now applied before doing anything else
which allows its use with `-f, --format` and `-c, --columns` options.

- Some consequence field names may not constitute a valid tag name, such
as "pos(1-based)". Field names are now trimmed to exclude brackets.

* bcftools +tag2tag:

- New --QR-QA-to-QS option to convert annotations generated by FreeBayes
to QS used by BCFtools

* bcftools +trio-dnm:

- Add support for sites with more than four alleles. Note that only the
four most frequent alleles are considered, the model remains unchanged.
Previously such sites were skipped.

- New --use-NAIVE option for a naive DNM calling based solely on FORMAT/GT
and expected Mendelian inheritance. This option is suitable for prefiltering.

- Fix behavior to match the documentation, the `--dnm-tag DNG` option now
correctly outputs log scaled values by default, not phred scaled.

- Fix bug in VAF calculation, homozygous de novo variants were incorrectly
reported as having VAF=50%

- Fix arithmetic underflow which could lead to imprecise scores and improve
sensitivity in high coverage regions

- Allow combining --pn and --pns to set the noise thresholds independently
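
To illustrate what the `+fill-tags` expression `DP:1=int(sum(FORMAT/AD))` listed above is expected to compute, the following sketch sums FORMAT/AD by hand with awk. It assumes that `sum()` pools all AD values across all samples at a site. The helper name `sum_ad` and the sample record are made up for illustration, and this is not how the plugin itself is implemented.

```shell
#!/bin/sh
# Hypothetical helper: for each tab-separated VCF data line on stdin,
# print the sum of all FORMAT/AD values across all samples.
sum_ad() {
    awk -F'\t' '{
        # Locate the AD key in the FORMAT column (column 9)
        n = split($9, fmt, ":"); idx = 0
        for (i = 1; i <= n; i++) if (fmt[i] == "AD") idx = i
        if (!idx) next                      # no AD at this site, skip the line
        total = 0
        for (s = 10; s <= NF; s++) {        # sample columns start at column 10
            split($s, f, ":")
            m = split(f[idx], ad, ",")      # AD holds one value per allele
            for (j = 1; j <= m; j++) if (ad[j] != ".") total += ad[j]
        }
        print total
    }'
}

# Made-up record with two samples: AD=10,5 and AD=0,8
printf '1\t100\t.\tA\tG\t.\t.\t.\tGT:AD\t0/1:10,5\t1/1:0,8\n' | sum_ad
```

For the made-up record the helper prints 23, the value one would expect to see as INFO/DP under the stated assumption.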


## Release 1.12 (17th March 2021)

Changes affecting the whole of bcftools, or multiple commands:

22 changes: 22 additions & 0 deletions README
@@ -3,3 +3,25 @@ SAMtools) and manipulating VCF and BCF files. The program is intended
to replace the Perl-based tools from vcftools.

See INSTALL for building and installation instructions.

Please cite this paper when using BCFtools for your publications:

Twelve years of SAMtools and BCFtools
Petr Danecek, James K Bonfield, Jennifer Liddle, John Marshall, Valeriu Ohan, Martin O Pollard, Andrew Whitwham, Thomas Keane, Shane A McCarthy, Robert M Davies, Heng Li
GigaScience, Volume 10, Issue 2, February 2021, giab008, https://doi.org/10.1093/gigascience/giab008

@article{10.1093/gigascience/giab008,
author = {Danecek, Petr and Bonfield, James K and Liddle, Jennifer and Marshall, John and Ohan, Valeriu and Pollard, Martin O and Whitwham, Andrew and Keane, Thomas and McCarthy, Shane A and Davies, Robert M and Li, Heng},
title = "{Twelve years of SAMtools and BCFtools}",
journal = {GigaScience},
volume = {10},
number = {2},
year = {2021},
month = {02},
abstract = "{SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis amongst other methods.The first version appeared online 12 years ago and has been maintained and further developed ever since, with many new features and improvements added over the years. The SAMtools and BCFtools packages represent a unique collection of tools that have been used in numerous other software projects and countless genomic pipelines.Both SAMtools and BCFtools are freely available on GitHub under the permissive MIT licence, free for both non-commercial and commercial use. Both packages have been installed \\&gt;1 million times via Bioconda. The source code and documentation are available from https://www.htslib.org.}",
issn = {2047-217X},
doi = {10.1093/gigascience/giab008},
url = {https://doi.org/10.1093/gigascience/giab008},
note = {giab008},
eprint = {https://academic.oup.com/gigascience/article-pdf/10/2/giab008/36332246/giab008.pdf},
}
24 changes: 24 additions & 0 deletions README.md
@@ -16,4 +16,28 @@ File format specifications live on [HTS-spec GitHub page](http://samtools.github
[samtools](https://github.com/samtools/samtools)
[tabix](https://github.com/samtools/tabix)

### Citing

Please cite this paper when using BCFtools for your publications. http://samtools.github.io/bcftools/howtos/publications.html

> Twelve years of SAMtools and BCFtools <br/>
> Petr Danecek, James K Bonfield, Jennifer Liddle, John Marshall, Valeriu Ohan, Martin O Pollard, Andrew Whitwham, Thomas Keane, Shane A McCarthy, Robert M Davies, Heng Li <br/>
> _GigaScience_, Volume 10, Issue 2, February 2021, giab008, https://doi.org/10.1093/gigascience/giab008
```
@article{10.1093/gigascience/giab008,
author = {Danecek, Petr and Bonfield, James K and Liddle, Jennifer and Marshall, John and Ohan, Valeriu and Pollard, Martin O and Whitwham, Andrew and Keane, Thomas and McCarthy, Shane A and Davies, Robert M and Li, Heng},
title = "{Twelve years of SAMtools and BCFtools}",
journal = {GigaScience},
volume = {10},
number = {2},
year = {2021},
month = {02},
abstract = "{SAMtools and BCFtools are widely used programs for processing and analysing high-throughput sequencing data. They include tools for file format conversion and manipulation, sorting, querying, statistics, variant calling, and effect analysis amongst other methods.The first version appeared online 12 years ago and has been maintained and further developed ever since, with many new features and improvements added over the years. The SAMtools and BCFtools packages represent a unique collection of tools that have been used in numerous other software projects and countless genomic pipelines.Both SAMtools and BCFtools are freely available on GitHub under the permissive MIT licence, free for both non-commercial and commercial use. Both packages have been installed \\&gt;1 million times via Bioconda. The source code and documentation are available from https://www.htslib.org.}",
issn = {2047-217X},
doi = {10.1093/gigascience/giab008},
url = {https://doi.org/10.1093/gigascience/giab008},
note = {giab008},
eprint = {https://academic.oup.com/gigascience/article-pdf/10/2/giab008/36332246/giab008.pdf},
}
```