piranha v1.2
Release notes
- Update to how phylo supplementary data are handled.
An optional set of local sequences can be supplied to supplement the phylogenetic analysis. To supply them to piranha, point to the correct directory using -sd,--supplementary-datadir. The sequence files should be in FASTA format, but do not need to be aligned. To allow piranha to assign each sequence to the relevant phylogeny, the sequence header should have the reference group annotated in the format display_name=X, for example display_name=Sabin1-related.
These supplementary sequence files can be accompanied by csv metadata files (one row per supplementary sequence). This metadata can be included in the final report and annotated onto the phylogenies (-smcol/--supplementary-metadata-columns). By default, the metadata is matched to the FASTA sequence name using a column titled sequence_name, but this column name can be configured with -smid/--supplementary-metadata-id-column.
Piranha will iterate across the directory supplied and amalgamate the FASTA files, retaining any sequences with display_name=X in the header description, where X can be one of Sabin1-related, Sabin2-related, Sabin3-related or WPV1. It will then read in every csv file it detects in this directory and attempt to match the metadata to the gathered FASTA records. The matched records will be added to the relevant phylogenies.
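The gathering and matching steps described above can be sketched roughly as follows. This is an illustrative sketch only, not piranha's actual implementation: the function names (read_fasta, gather_supplementary), the accepted file extensions, and the way the header is parsed are all assumptions made for the example.

```python
import csv
import re
from pathlib import Path

# Reference groups named in the release notes.
REFERENCE_GROUPS = {"Sabin1-related", "Sabin2-related", "Sabin3-related", "WPV1"}
# Assumed FASTA extensions; piranha may detect files differently.
FASTA_SUFFIXES = {".fasta", ".fa"}


def read_fasta(path):
    """Yield (header, sequence) for each record in a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line and header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gather_supplementary(datadir, id_column="sequence_name"):
    """Collect annotated FASTA records and attach any matching csv metadata."""
    records = {}
    for path in sorted(Path(datadir).iterdir()):
        if path.suffix.lower() not in FASTA_SUFFIXES:
            continue
        for header, sequence in read_fasta(path):
            match = re.search(r"display_name=(\S+)", header)
            # Keep only sequences annotated with a recognised reference group.
            if match and match.group(1) in REFERENCE_GROUPS:
                name = header.split()[0]
                records[name] = {
                    "group": match.group(1),
                    "sequence": sequence,
                    "metadata": {},
                }
    for path in sorted(Path(datadir).glob("*.csv")):
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                name = row.get(id_column)
                # Rows that match no gathered sequence are skipped.
                if name in records:
                    records[name]["metadata"].update(row)
    return records
```

In this sketch, the id_column argument plays the role of -smid/--supplementary-metadata-id-column, and sequences lacking a recognised display_name annotation are dropped before any metadata is matched.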
- Update to how the local database is updated
If you supply a path with -sd,--supplementary-datadir for the phylogenetics module, you have the option of updating this data directory with the new consensus sequences generated during the piranha analysis. If you run with the -ud,--update-local-database flag, piranha will write out the new sequences and any accompanying metadata supplied into the directory provided.
The files written out will be in the format runname.today.fasta and runname.today.csv. For example, if your runname supplied is MIN001 and today's date is 2023-11-05, the files written will be:
MIN001.2023-11-05.fasta and MIN001.2023-11-05.csv with the newly generated consensus sequences and accompanying metadata from that run.
Note: if you supply the supplementary directory to piranha on a subsequent run, your updated local database will be included in the phylogenetics. However, piranha will ignore any files whose runname.today pattern is identical to that of the active run. For example, if your current run would produce files called MIN001.2023-11-05.fasta and MIN001.2023-11-05.csv and those files already exist in the supplementary data directory, they will be ignored. This is to avoid conflicts if piranha is run multiple times on the same data.
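The naming and skipping behaviour can be illustrated with a short sketch. It is not taken from piranha's source: the helper names (database_stem, files_to_include) are hypothetical, and the sketch assumes the date is written in ISO format (YYYY-MM-DD), as in the MIN001.2023-11-05 example.

```python
from datetime import date
from pathlib import Path


def database_stem(runname, today=None):
    """Build the runname.today stem, e.g. MIN001.2023-11-05."""
    today = today or date.today()
    return f"{runname}.{today.isoformat()}"


def files_to_include(datadir, runname, today=None):
    """List supplementary files, skipping those the active run would write."""
    active_stem = database_stem(runname, today)
    return sorted(
        path
        for path in Path(datadir).iterdir()
        if path.suffix in {".fasta", ".csv"} and path.stem != active_stem
    )
```

With runname MIN001 on 2023-11-05, database_stem gives MIN001.2023-11-05, so any existing MIN001.2023-11-05.fasta and MIN001.2023-11-05.csv are left out, while files from other runs or dates are kept.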
- Piranha now runs on EPI2ME
- Phylo pipeline added to github actions