This repository contains the code used in the multi-omic analyses of faecal microbiota from four families with several cases of type 1 diabetes mellitus, T1DM (MuSt).
The scripts are linked from the bullet points below. The sub-headings link to descriptions of the workflows that connect the individual scripts.
to build a search database for proteomics from predicted proteins and their variants:
- rename4proteomics.pl
- trypsinStartEndProdigal.pl (corrected version)
- variant_annotateRepairedTabProdigal.pl (corrected version)
- variants_annotateTab4StatsProdigal.pl (corrected version)
- trypsinStartEnd.pl (old version)
- variant_annotateRepairedTab.pl (old version)
- variant_annotateRepairedTabProdigalStillWrong.pl (version to keep workflow)
- variants_annotateTab4Stats.pl (old version)
- variants_locateType.pl
to parse functional annotations of gene predictions (some including coverage):
- 150310_MUST_hmmBestAll.py
- 150705_MUST_hmmParse.py
- 150705_MUST_hmmParsePfam.py
- consolidate_hmmscan_results.pl
- consolidate_hmmscan_results_justKEGG.pl
- 150705_MUST_keggParseNW.py
- ko2des_clean.txt (large text file)
- calculateCoverageAndGaps2.pl
- 150322_bestHmmReadParse.py
- 150415_bestHmmAveCovParse.py
- 150630_keggReadParse.py
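
For orientation, the kind of parsing step this group of scripts deals with can be sketched as follows. This is a minimal illustration and not the code in this repository: it assumes the input is the standard `--tblout` table written by `hmmscan` (HMM name in column 1, query gene in column 3, full-sequence bit score in column 6), and the function name `best_hits` is hypothetical.

```python
def best_hits(tblout_lines):
    """Keep the highest-scoring HMM per gene from hmmscan --tblout lines.

    Assumption: whitespace-separated columns as in HMMER's --tblout format,
    i.e. fields[0] = HMM (target) name, fields[2] = gene (query) name,
    fields[5] = full-sequence bit score. Returns {gene: (hmm, score)}.
    """
    best = {}
    for line in tblout_lines:
        # Skip comment/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.split()
        hmm_name, gene_id, score = fields[0], fields[2], float(fields[5])
        # Replace the stored hit only if this one scores strictly higher.
        if gene_id not in best or score > best[gene_id][1]:
            best[gene_id] = (hmm_name, score)
    return best
```

A table file could be passed in as `best_hits(open("genes.tblout"))`, where the file name is only a placeholder. The scripts listed above may apply additional score cut-offs or database-specific rules that this sketch does not attempt to reproduce.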
to annotate phylogenetic marker genes with the taxonomy of the best hit from the mOTU database:
to parse taxonomy of MG-RAST annotations of genes:
to automatically cluster contigs based on nucleotide signature (BH-SNE maps), DNA coverage and essential genes:
to gather contig clusters by related phylogenetic marker genes in a phylogenetic tree:
to reconstruct a metabolic network from KOs and analyse it:
- 140630_MUST_NW.R
- the above script requires the file 150705_KOs_in_NW.tsv
- runHeinz.sh
- plotModules_omicLevels.R