How to cite this work in a publication:
This project was part of the ASBCB Omics Codeathon.
Advances in Next Generation Sequencing technologies have generated large amounts and many types of omics data. Until now, most comprehensive molecular profiling studies have profiled biological samples on a single layer of genomic activity, such as DNA, mRNA, proteins, metabolites, or epigenetic features like DNA methylation and histone post-translational modifications (PTMs). Each of these layers can be influenced by the health status of the individual and can affect the phenotype. Integrating multiple sources of data into multi-omics datasets can enhance our understanding of the biological system, but it also brings new challenges for the development of statistical methods for integration.
Cancer is a major cause of death worldwide. Molecular alterations at different levels of biological information processing contribute to the final phenotypic presentation of cancer. As such, there is a need to understand how the various information layers interact and how they are implicated in cancer onset, progression, and treatment. To address this problem, we aim to leverage the publicly available tumor/normal genomic datasets from TCGA to identify and integrate different omics layers. The outcome of this integration will then be tied to the clinical variables available from these sources. Finally, the integrated dataset will be used to build a prediction/classification model.
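The integrate-then-classify idea described above can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual pipeline: the function names (`integrate_layers`, `fit_centroids`, `predict`), the toy sample IDs, the feature names, and the choice of a nearest-centroid classifier are all assumptions made for illustration. It only shows the general shape of joining omics layers on shared sample IDs, attaching a clinical label, and fitting a simple model.

```python
from statistics import mean


def integrate_layers(layers, labels):
    """Inner-join per-sample features from several omics layers.

    layers: {layer_name: {sample_id: {feature_name: value}}}
    labels: {sample_id: clinical_label}
    Returns {sample_id: (features, label)} for samples found in every
    layer and in the clinical labels. Assumes each sample in a layer
    carries the same set of features.
    """
    shared = set(labels)
    for samples in layers.values():
        shared &= set(samples)

    integrated = {}
    for sample_id in sorted(shared):
        features = {}
        for layer_name, samples in layers.items():
            for feature, value in samples[sample_id].items():
                # Prefix with the layer name so features cannot collide.
                features[f"{layer_name}:{feature}"] = value
        integrated[sample_id] = (features, labels[sample_id])
    return integrated


def fit_centroids(integrated):
    """Average every feature per class (nearest-centroid classifier)."""
    by_label = {}
    for features, label in integrated.values():
        by_label.setdefault(label, []).append(features)
    return {
        label: {name: mean(row[name] for row in rows) for name in rows[0]}
        for label, rows in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean)."""
    def distance(centroid):
        return sum((features[name] - centroid[name]) ** 2 for name in centroid)
    return min(centroids, key=lambda label: distance(centroids[label]))


# Hypothetical toy data: two omics layers plus clinical labels.
layers = {
    "mrna": {"S1": {"GENE_A": 5.0}, "S2": {"GENE_A": 1.0}, "S3": {"GENE_A": 4.5}},
    "methylation": {"S1": {"CPG_1": 0.2}, "S2": {"CPG_1": 0.9}, "S4": {"CPG_1": 0.5}},
}
labels = {"S1": "tumor", "S2": "normal", "S3": "tumor", "S4": "normal"}

integrated = integrate_layers(layers, labels)
centroids = fit_centroids(integrated)

print(sorted(integrated))  # ['S1', 'S2'] (only samples present in both layers)
print(predict(centroids, {"mrna:GENE_A": 4.8, "methylation:CPG_1": 0.3}))  # tumor
```

The inner join keeps only samples measured in every layer, which is the simplest way to handle missing data. A real analysis would likely also need normalization across layers, feature selection, and a held-out evaluation set.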
The workflow below shows how "MODIBC" works.
Software
Installation
Configuration
Testing
David Adeleke
Margaret Wanjiku
Samuel Nkrumah
Imane Allali
Kholoud Sanak
Samuel Baffoe
Hannah Nyarko