BioSankey is a tool for generating Sankey plots from biological data, using either gene expression or microbial data. Sankey plots are suitable for showing changes in counts or abundances over time (e.g. gene expression and abundances of microbial species). The plots are produced as an interactive JavaScript HTML page and as static PDF plots. Multiple input formats are supported.
OTU;"Taxonomic group";1;2;3;4;5
XYZ1;"Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Acinetobacter;";0;73;62;3;1454
XYZ2;"Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacteriales;Enterobacteriaceae;Citrobacter;";4549;23;4;8;37
XYZ3;"Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Bacteroidaceae;Bacteroides;";133489;555;294;1242;443
XYZ4;"Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacteriales;Enterobacteriaceae;Yokenella;";0;68;35;263;709
XYZ5;"Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Bacteroidaceae;Bacteroides;";346;23;224;6;66
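The OTU table above is semicolon-delimited, with the taxonomic lineage quoted because it contains semicolons itself. The following is a minimal sketch of how such a table could be read and summed per taxonomic group with the Python standard library. It is not BioSankey's own parser; the function names `read_otu_table` and `sum_by_last_rank` are hypothetical, and treating the last rank of the lineage as the group to sum over is an assumption made for illustration.

```python
import csv
import io

# Two rows copied from the example table above.
EXAMPLE = '''OTU;"Taxonomic group";1;2;3;4;5
XYZ3;"Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Bacteroidaceae;Bacteroides;";133489;555;294;1242;443
XYZ5;"Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Bacteroidaceae;Bacteroides;";346;23;224;6;66
'''


def read_otu_table(handle):
    """Return (timepoints, rows); each row is (otu_id, lineage, counts)."""
    reader = csv.reader(handle, delimiter=";", quotechar='"')
    header = next(reader)
    timepoints = header[2:]  # columns after the OTU ID and the lineage
    rows = []
    for record in reader:
        if not record:
            continue  # skip empty lines
        otu_id, taxonomy = record[0], record[1]
        # The lineage ends with ';', so drop the empty trailing entry.
        lineage = [rank for rank in taxonomy.split(";") if rank]
        counts = [int(value) for value in record[2:]]
        rows.append((otu_id, lineage, counts))
    return timepoints, rows


def sum_by_last_rank(rows):
    """Sum the counts of all OTUs that share the same last lineage entry."""
    totals = {}
    for _otu_id, lineage, counts in rows:
        name = lineage[-1]
        if name in totals:
            totals[name] = [a + b for a, b in zip(totals[name], counts)]
        else:
            totals[name] = list(counts)
    return totals


timepoints, rows = read_otu_table(io.StringIO(EXAMPLE))
totals = sum_by_last_rank(rows)
print(timepoints)
print(totals)
```

For a file on disk, `read_otu_table(open(path, newline=""))` could be used in the same way.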
Python and a web browser must be installed, preferably a browser with an efficient JavaScript engine. BioSankey was tested with Python 3.6 under Windows and requires no additional dependencies, but an internet connection is needed to load the Google Charts API used for the diagrams.
No particular installation is needed; only the files of this repository need to be downloaded.
To generate a project-specific HTML page that allows querying the data, either by genes or by DEG lists, the Python scripts need to be run.
The user specifies the required files by choosing them in the graphical user interface (started with biosankey.py). There, the user can upload expression information, domain information or microbial information.
For demonstration purposes, we used the data from Morandi, Elena, et al. "Gene expression timeseries analysis of camptothecin effects in U87MG and DBTRG05 glioblastoma cell lines." Molecular cancer 7.1 (2008): 66. containing expression information at six different timepoints: 2h, 6h, 16h, 24h, 48h, and 72h.
We also added a use case (Use case 2) that includes OTU data from Caporaso JG, Lauber CL, Costello EK, Berg-Lyons D, Gonzalez A, Stombaugh J, Knights D, Gajer P, Ravel J, Fierer N, Gordon JI, and Knight R. 2011. Moving pictures of the human microbiome. Genome Biol 12:R50. 10.1186/gb-2011-12-5-r50
Genes divided into up- and downregulated, or abundances of genes or other entities (microbial species of a certain taxon, metabolites, ...)
A gene list is required for every timepoint of interest, and the listed genes must also be contained in the expression data. For example, genes that are downregulated at 16h would be summarized in the file '16h_down.dat'.
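The requirement above can be checked before starting the GUI. The sketch below is only an illustration under several assumptions: the file names follow the pattern `<timepoint>_<direction>.dat` (only '16h_down.dat' is given in this README, so the `_up` counterpart is assumed), each file holds one gene identifier per line, and the function names are hypothetical rather than part of BioSankey.

```python
import os
import tempfile


def expected_list_names(timepoints, directions=("up", "down")):
    """Build the assumed file names, e.g. '16h_down.dat'."""
    return [f"{tp}_{d}.dat" for tp in timepoints for d in directions]


def read_gene_list(path):
    """Read one gene identifier per line (assumed layout), skipping blanks."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def check_gene_lists(folder, timepoints, expression_genes):
    """Map each problematic file name to a short description of the problem."""
    problems = {}
    known = set(expression_genes)
    for name in expected_list_names(timepoints):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            problems[name] = "file missing"
            continue
        unknown = [gene for gene in read_gene_list(path) if gene not in known]
        if unknown:
            problems[name] = "genes not in expression data: " + ", ".join(unknown)
    return problems


# Small self-contained demonstration with made-up gene identifiers.
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "16h_down.dat"), "w") as handle:
        handle.write("GENE_A\nGENE_X\n")
    report = check_gene_lists(folder, ["16h"], ["GENE_A", "GENE_B"])

print(report)
```

An empty report means that every expected list exists and contains only genes that are present in the expression data.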
To produce publication-ready images, we suggest using PhantomJS, which can take a screenshot of the current web page.
phantomjs-2.1.1-linux-i686/bin/phantomjs rasterize.js 'Use_case_combined.html' V1.png "100cm*80cm" 3
convert V1.png -trim V1x1.png
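If the two commands above need to be repeated for several HTML pages, they could be wrapped in a small Python helper. This is a sketch, not part of BioSankey: the function names are hypothetical, the PhantomJS path and arguments are copied from the commands above, and the last argument (`3`) is assumed to be the zoom factor accepted by the `rasterize.js` example script. `convert` is the ImageMagick command used above.

```python
import subprocess


def screenshot_commands(html_page, png, trimmed_png,
                        phantomjs="phantomjs-2.1.1-linux-i686/bin/phantomjs",
                        paper_size="100cm*80cm", zoom="3"):
    """Return the two command lines shown above as argument lists."""
    # 'zoom' is assumed to be the meaning of the last rasterize.js argument.
    render = [phantomjs, "rasterize.js", html_page, png, paper_size, zoom]
    trim = ["convert", png, "-trim", trimmed_png]
    return [render, trim]


def make_screenshot(html_page, png, trimmed_png, **options):
    """Run both commands in order; raises CalledProcessError if one fails."""
    for command in screenshot_commands(html_page, png, trimmed_png, **options):
        subprocess.run(command, check=True)


commands = screenshot_commands("Use_case_combined.html", "V1.png", "V1x1.png")
for command in commands:
    print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting problems with values such as `100cm*80cm`. Calling `make_screenshot("Use_case_combined.html", "V1.png", "V1x1.png")` would execute both steps, provided PhantomJS and ImageMagick are available.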
After the OTU table has been added using the GUI and the microbiome button has been clicked, the BioSankey plot opens and the taxonomic groups are shown.
To show the OTUs of a taxonomic group, click on the group, e.g. 'Neisseria'. All OTUs of that group are then shown.
To export Excel files to TSV, we recommend using xlsx2tsv.
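The selection behind such a click can be thought of as a filter on the taxonomic lineage. The following sketch illustrates the idea using rows from the example table in this README; it is not BioSankey's actual code, and the function name `otus_in_group` is hypothetical.

```python
def otus_in_group(taxonomy_by_otu, group):
    """Return the IDs of all OTUs whose lineage contains the given group."""
    selected = []
    for otu_id, taxonomy in taxonomy_by_otu.items():
        # Split the lineage into ranks and drop the empty trailing entry.
        ranks = [rank for rank in taxonomy.split(";") if rank]
        if group in ranks:
            selected.append(otu_id)
    return selected


# Lineages copied from the example OTU table.
TAXONOMY = {
    "XYZ1": "Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Moraxellaceae;Acinetobacter;",
    "XYZ3": "Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Bacteroidaceae;Bacteroides;",
    "XYZ5": "Bacteria;Bacteroidetes;Bacteroidia;Bacteroidales;Bacteroidaceae;Bacteroides;",
}

selected = otus_in_group(TAXONOMY, "Bacteroides")
print(selected)
```

Matching on whole rank names rather than substrings keeps 'Bacteroides' from also selecting 'Bacteroidetes' or 'Bacteroidales'.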
If you have any issues or suggestions, please contact Alexander Platzer ( alexander.platzer AT univie.ac.at ) or Thomas Nussbaumer ( thomas.nussbaumer AT univie.ac.at )
Now, we added
<script>
google.charts.load('current', {packages: ['corechart','sankey']});
</script>
to the existing code, so that BioSankey can still be used with Google Charts. Thanks to M. Mammel for reporting this problem, which was caused by a change in how the JS libraries are loaded.
For microbiome datasets, a message is now shown when the data is not correctly formatted.
Fixed a height issue when displaying multiple OTUs.