GitHub - imbs-hl/fuseMLR: Fusing Machine Learning in R

fuseMLR

Cesaire J. K. Fouodo

Introduction

Recent technological advances have enabled the simultaneous collection of multi-omics data, i.e., different types or modalities of molecular data across various organ tissues of patients. Such data are particularly challenging to analyze for integrative predictive modeling. Ideally, data from the different modalities are measured in the same individuals, allowing for early or intermediate integrative techniques. However, these techniques are often not applicable when patient data only partially overlap across modalities, which requires either reducing the datasets or imputing missing values. Additionally, the characteristics of each data modality may necessitate specific statistical methods rather than applying the same method across all modalities. Late integration modeling approaches analyze each data modality separately to obtain modality-specific predictions. These predictions are then integrated, either by training an aggregative model such as Lasso or random forests, or by computing their weighted mean.
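To make the late integration idea concrete, here is a minimal sketch in base R that does not use fuseMLR. It simulates two modalities, fits one logistic regression per modality, and combines the modality-specific predictions with a weighted mean. The data, the variable names, and the weighting rule (inverse training Brier score) are assumptions chosen for illustration only.

```r
set.seed(42)

# Simulated data: two modalities measured on the same 200 individuals.
n <- 200
signal <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(1.5 * signal))
modality_a <- data.frame(y = y, a1 = signal + rnorm(n), a2 = rnorm(n))
modality_b <- data.frame(y = y, b1 = signal + rnorm(n, sd = 2), b2 = rnorm(n))

train_idx <- 1:150
test_idx <- 151:200

# Step 1: train one model per modality.
fit_a <- glm(y ~ ., data = modality_a[train_idx, ], family = binomial())
fit_b <- glm(y ~ ., data = modality_b[train_idx, ], family = binomial())

# Step 2: obtain modality-specific predictions for the held-out individuals.
pred_a <- predict(fit_a, newdata = modality_a[test_idx, ], type = "response")
pred_b <- predict(fit_b, newdata = modality_b[test_idx, ], type = "response")

# Step 3: aggregate with a weighted mean. The weights are inversely
# proportional to each model's Brier score on the training data
# (an illustrative choice, not a recommendation).
brier <- function(p, obs) mean((p - obs)^2)
train_brier <- c(
  a = brier(fitted(fit_a), y[train_idx]),
  b = brier(fitted(fit_b), y[train_idx])
)
weights <- (1 / train_brier) / sum(1 / train_brier)

pred_meta <- weights[["a"]] * pred_a + weights[["b"]] * pred_b
```

In this sketch the weights come from in-sample performance to keep it short. In practice, weights or a meta-model would typically be derived from out-of-sample (for example cross-validated) modality-specific predictions.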

We introduce the R package fuseMLR for late integration prediction modeling. This comprehensive package enables users to define a training process with multiple data modalities and modality-specific machine learning methods. The package is user-friendly, facilitates variable selection and training of different models across modalities, and automatically performs aggregation once modality-specific training is completed. We simulated multi-omics data to illustrate the usage of our new package for conducting late-stage multi-omics integrative modeling.

fuseMLR is an object-oriented package based on R6 version 2.5.1. Refer to the vignette (section Usage below) for a quick overview of how to use the package.

The following figure illustrates the general architecture of fuseMLR:

The general architecture of fuseMLR includes the collection classes Training, TrainLayer, and TrainMetaLayer. TrainLayer and TrainMetaLayer are stored within a Training instance, while TrainData, Lrner, and VarSel (for variable selection) are stored within a TrainLayer or TrainMetaLayer instance. A Training object can be used to automatically conduct layer-specific variable selection and train the layer-specific learners and the meta-learner. Analogously, a Testing object can be set up after training and used for prediction.
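The containment described above can be mimicked with plain R lists. The sketch below is not the fuseMLR API: `make_layer`, `make_training`, `run_training`, `select_top_cor`, and `fit_lm` are hypothetical helpers that only illustrate how a training object could hold layers, each bundling its data, a variable selection step, and a learner, with a meta-learner trained on the layer-specific predictions.

```r
# Hypothetical plain-list stand-ins for the containers described above.
# These are NOT fuseMLR classes or functions.
make_layer <- function(id, data, var_sel, learner) {
  list(id = id, data = data, var_sel = var_sel, learner = learner)
}

make_training <- function(target, layers, meta_learner) {
  list(target = target, layers = layers, meta_learner = meta_learner)
}

# Variable selection: keep the k columns most correlated with the target.
select_top_cor <- function(data, target, k = 2) {
  scores <- abs(cor(data, target))[, 1]
  names(sort(scores, decreasing = TRUE))[seq_len(min(k, length(scores)))]
}

# Learner: ordinary linear regression, returning the fit and its predictions.
fit_lm <- function(data, target) {
  fit <- lm(target ~ ., data = data)
  list(fit = fit, predictions = unname(fitted(fit)))
}

# Run variable selection and training per layer, then train the meta-learner
# on the layer-specific predictions (one column per layer).
run_training <- function(training) {
  layer_preds <- lapply(training$layers, function(layer) {
    keep <- layer$var_sel(layer$data, training$target)
    model <- layer$learner(layer$data[, keep, drop = FALSE], training$target)
    model$predictions
  })
  training$meta_learner(as.data.frame(layer_preds), training$target)
}

set.seed(1)
n <- 100
signal <- rnorm(n)
target <- signal + rnorm(n, sd = 0.5)
omics_a <- data.frame(a1 = signal + rnorm(n), a2 = rnorm(n), a3 = rnorm(n))
omics_b <- data.frame(b1 = signal + rnorm(n), b2 = rnorm(n), b3 = rnorm(n))

training <- make_training(
  target = target,
  layers = list(
    layer_a = make_layer("layer_a", omics_a, select_top_cor, fit_lm),
    layer_b = make_layer("layer_b", omics_b, select_top_cor, fit_lm)
  ),
  meta_learner = fit_lm
)

meta_model <- run_training(training)
coef(meta_model$fit)
```

For brevity, the meta-learner here is fitted on in-sample layer predictions. Refer to the package vignette for the actual classes and functions.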

Installation

Install the release version from CRAN with

install.packages("fuseMLR")

Install the development version from GitHub with

devtools::install_github("imbs-hl/fuseMLR")

Usage

Refer to our vignette to understand how fuseMLR works.
