SionBayliss edited this page Feb 27, 2019 · 7 revisions

Cataloguing genes and their distributions within natural bacterial populations is essential for understanding evolutionary processes and the genetic bases of adaptation. Identifying genes that are shared between different bacterial strains and species is likewise essential for understanding the genomic variation that underlies the enormous phenotypic variation observed in the microbial world. Here we present a pangenomics toolbox, PIRATE, which identifies and classifies orthologous gene families in bacterial pangenomes over a wide range of sequence similarity thresholds. PIRATE builds upon recent scalable software developments for the rapid interrogation of pangenomes from large datasets of thousands of genomes. PIRATE clusters genes (or other annotated features) over a wide range of amino-acid or nucleotide identity thresholds, and classifies paralogous gene families into either putative gene fission/fusion events or gene duplications. Furthermore, PIRATE provides a measure of allelic variance and cluster homology, and orders the resulting pangenome on a pangenome graph. Additional scripts are provided for comparison and visualization. PIRATE provides a robust framework for analysing the pangenomes of bacteria, from largely clonal to panmictic species.

Availability and implementation

PIRATE is implemented in Perl and is freely available under a GNU GPL 3 open-source license from https://github.com/SionBayliss/PIRATE.

Contact: s.bayliss (AT) bath.ac.uk

Additional Details

The PIRATE wiki contains additional information on the methodology and outputs.

Referencing

An application note is in preparation/submission.
