Commit a0cc4c8

Version 1.1

fouodo committed Nov 27, 2024
1 parent 03e978a commit a0cc4c8
Showing 3 changed files with 107 additions and 29 deletions.
34 changes: 21 additions & 13 deletions README.Rmd
@@ -1,12 +1,3 @@
---
title: "fuseMLR"
author: Cesaire J. K. Fouodo
output:
md_document:
variant: gfm
preserve_yaml: true
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
@@ -192,11 +183,21 @@ print(training)

Use `extractModel` to retrieve the list of stored models and `extractData` to retrieve training data.


```{r basic_lrnr, include=TRUE, eval=TRUE}
models_list <- extractModel(training = training)
data_list <- extractData(training = training)
print(str(object = models_list, max.level = 1L))
```

The list of four models (three random forests and one weighted meta-model) trained on each layer is returned.

```{r basic_data, include=TRUE, eval=TRUE}
data_list <- extractData(object = training)
str(object = data_list, max.level = 1)
```

The list of the four training data sets (the three simulated training modalities and the meta-data) is returned.

#### E) Predicting

In this section, we create a ```testing``` instance (from the *Testing* class) and make predictions for new data. This is done analogously to ```training```. The only difference is that only the testing data modalities are required. Relevant functions are ```createTesting()``` and ```createTestLayer()```.
@@ -222,10 +223,18 @@ createTestLayer(testing = testing,
test_data = multi_omics$testing$proteinexpr)
```
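For completeness, here is a minimal sketch of how the `testing` instance itself is created; the diff fold above hides that call, so the argument names below mirror the `training` example and are assumptions, not verified against the package:

```r
library(fuseMLR)  # assumed installed, as in the rest of this README

# Assumed signature, mirroring createTraining(): an identifier for the
# instance plus the column holding patient IDs (names are assumptions).
testing <- createTesting(id = "testing",
                         ind_col = "IDS")
```

Each test modality is then attached with `createTestLayer()`, as shown above.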

- An upset plot of the training data: Visualize patient overlap across layers.
A look at the testing data.

```{r basic_test_data, include=TRUE, eval=TRUE}
data_list <- extractData(object = testing)
str(object = data_list, max.level = 1)
```

An upset plot of the testing data: visualize patient overlap across layers.

```{r upsetplot_new, include=TRUE, eval=TRUE}
upsetplot(object = testing, order.by = "freq")
# See also extractData(object = testing)
```

- Predict the testing object.
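The fold below hides the prediction call; a minimal sketch of what it looks like follows (the `predict` signature is assumed from the package's usage pattern in this README, not verified):

```r
# Assumed predict method: pass the trained Training object together with
# the Testing instance holding the new data modalities.
predictions <- predict(object = training, testing = testing)
print(predictions)
```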
@@ -267,7 +276,6 @@ perf_overlapping <- sapply(X = actual_pred[complete.cases(actual_pred),
print(perf_overlapping)
```
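The values shown by `print(perf_overlapping)` are on the Brier scale (mean squared difference between observed class and predicted probability); that it is exactly the Brier score is an assumption, since the computing code is folded above. A self-contained base-R illustration with made-up values:

```r
# Brier score: mean squared difference between observed 0/1 outcomes and
# predicted probabilities; lower is better, 0.25 is the coin-flip baseline.
actual <- c(1, 0, 1, 1, 0)            # hypothetical observed classes
pred   <- c(0.8, 0.2, 0.6, 0.9, 0.4)  # hypothetical predicted probabilities
brier  <- mean((actual - pred)^2)
print(brier)
```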

Note that our example is based on simulated data for illustration purposes; a single run is not enough to assess the performance of our models.

# E - Interface and wrapping #

@@ -371,7 +379,7 @@ library(knitr)
# Create a data frame
data <- data.frame(
  Learner = c("weightedMeanLearner", "bestSpecificLearner"),
  Learner = c("weightedMeanLearner", "bestLayerLearner"),
Description = c("The weighted mean meta learner. It uses meta data to estimate the weights of the modality-specific models", "The best layer-specific model is used as meta model.")
)
78 changes: 67 additions & 11 deletions README.md
@@ -1,3 +1,12 @@
---
title: "fuseMLR"
author: Cesaire J. K. Fouodo
output:
md_document:
variant: gfm
preserve_yaml: true
---

<!-- badges: start -->

[![R-CMD-check](https://github.com/imbs-hl/fuseMLR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/imbs-hl/fuseMLR/actions/workflows/R-CMD-check.yaml)
@@ -164,7 +173,7 @@ createTrainLayer(training = training,
mtry = 1L,
na.action = "na.learn"),
param_pred_list = list(),
na_rm = FALSE)
na_action = "na.keep")
```

## Training : training
@@ -192,7 +201,7 @@ createTrainLayer(training = training,
mtry = 1L,
na.action = "na.learn"),
param_pred_list = list(),
na_rm = FALSE)
na_action = "na.keep")
```

## Training : training
@@ -220,7 +229,7 @@ createTrainLayer(training = training,
mtry = 1L,
na.action = "na.learn"),
param_pred_list = list(),
na_rm = FALSE)
na_action = "na.keep")
```

## Training : training
@@ -327,6 +336,9 @@ training <- fusemlr(training = training,
k = 10L))
```

## Warning in fusemlr(training = training, use_var_sel = TRUE, resampling_method =
## NULL, : Variable selection has been already performed.

## Training for fold 1.

## Training on layer geneexpr started.
@@ -488,6 +500,7 @@ print(training)
## Status : Trained
## Number of layers: 4
## Layers trained : 4
## Var. sel. used : Yes
## p : 131 | 160 | 160 | 3
## n : 50 | 50 | 50 | 64

@@ -500,9 +513,34 @@ Use `extractModel` to retrieve the list of stored models and

``` r
models_list <- extractModel(training = training)
data_list <- extractData(training = training)
print(str(object = models_list, max.level = 1L))
```

## List of 4
## $ geneexpr :List of 14
## $ proteinexpr:List of 14
## $ methylation:List of 14
## $ meta_layer : 'weightedMeanLearner' Named num [1:3] 0.512 0.276 0.212
## ..- attr(*, "names")= chr [1:3] "geneexpr" "proteinexpr" "methylation"
## NULL

The list of four models (three random forests and one weighted
meta-model) trained on each layer is returned.

``` r
data_list <- extractData(object = training)
str(object = data_list, max.level = 1)
```

## List of 4
## $ geneexpr :'data.frame': 50 obs. of 133 variables:
## $ proteinexpr:'data.frame': 50 obs. of 162 variables:
## $ methylation:'data.frame': 50 obs. of 162 variables:
## $ meta_layer :'data.frame': 64 obs. of 5 variables:

The list of the four training data sets (the three simulated training
modalities and the meta-data) is returned.

#### E) Predicting

In this section, we create a `testing` instance (from the *Testing*
@@ -551,15 +589,31 @@ createTestLayer(testing = testing,
## p : 131 | 160 | 160
## n : 20 | 20 | 20

- An upset plot of the training data: Visualize patient overlap across
layers.
A look at the testing data.

``` r
data_list <- extractData(object = testing)
str(object = data_list, max.level = 1)
```

## List of 3
## $ geneexpr :'data.frame': 20 obs. of 132 variables:
## $ proteinexpr:'data.frame': 20 obs. of 161 variables:
## $ methylation:'data.frame': 20 obs. of 161 variables:

An upset plot of the testing data: visualize patient overlap across
layers.

``` r
upsetplot(object = testing, order.by = "freq")
```

![](README_files/figure-gfm/upsetplot_new-1.png)<!-- -->

``` r
# See also extractData(object = testing)
```

- Predict the testing object.

``` r
@@ -640,9 +694,6 @@ print(perf_overlapping)
## geneexpr proteinexpr methylation meta_layer
## 0.3093583 0.3448970 0.2932064 0.2993118

Note that our example is based on simulated data for illustration
purposes; a single run is not enough to assess the performance of our models.

# E - Interface and wrapping

We distinguish common supervised learning arguments from method specific
@@ -682,7 +733,7 @@ createTrainLayer(training = training,
kernel = 'radial',
probability = TRUE),
param_pred_list = list(probability = TRUE),
na_rm = TRUE,
na_action = "na.keep",
x = "x",
y = "y",
object = "object",
@@ -699,6 +750,7 @@ createTrainLayer(training = training,
## Status : Trained
## Number of layers: 4
## Layers trained : 4
## Var. sel. used : Yes
## p : 131 | 160 | 160 | 3
## n : 50 | 50 | 50 | 64

@@ -726,6 +778,9 @@ training <- fusemlr(training = training,
use_var_sel = TRUE)
```

## Warning in fusemlr(training = training, use_var_sel = TRUE): Variable selection
## has been already performed.

## Training for fold 1.

## Training on layer geneexpr started.
@@ -887,6 +942,7 @@ print(training)
## Status : Trained
## Number of layers: 4
## Layers trained : 5
## Var. sel. used : Yes
## p : 131 | 160 | 160 | 3
## n : 50 | 50 | 50 | 64

@@ -947,7 +1003,7 @@ implemented the following ones.
| Learner | Description |
|:--------------------|:----------------------------------------------------------------------------------------------------------|
| weightedMeanLearner | The weighted mean meta learner. It uses meta data to estimate the weights of the modality-specific models |
| bestSpecificLearner | The best layer-specific model is used as meta model. |
| bestLayerLearner | The best layer-specific model is used as meta model. |
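To make the `weightedMeanLearner` row concrete, here is a base-R sketch of the combination rule it implements, reusing the weights printed by `extractModel` earlier; the layer-level predicted probabilities are made up, and treating the combination as a plain weighted mean is an assumption:

``` r
# Hypothetical layer-specific predicted probabilities for two test samples.
layer_preds <- rbind(c(geneexpr = 0.8, proteinexpr = 0.7, methylation = 0.6),
                     c(geneexpr = 0.3, proteinexpr = 0.4, methylation = 0.2))
# Weights as estimated on the meta-data (printed by extractModel above).
w <- c(geneexpr = 0.512, proteinexpr = 0.276, methylation = 0.212)
# Meta prediction: weighted mean of the layer predictions per sample.
meta_pred <- as.vector(layer_preds %*% w)
print(meta_pred)
```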

© 2024 Institute of Medical Biometry and Statistics (IMBS). All rights
reserved.
24 changes: 19 additions & 5 deletions vignettes/how_to_use.Rmd
@@ -165,9 +165,17 @@ We use `extractModel()` to retrieve the list of stored models and `extractData()`

```{r basic_lrnr, include=TRUE, eval=TRUE}
models_list <- extractModel(training = training)
data_list <- extractData(training = training)
print(str(object = models_list, max.level = 1L))
```

The list of four models (three random forests and one weighted meta-model) trained on each layer is returned.

```{r basic_data, include=TRUE, eval=TRUE}
data_list <- extractData(object = training)
str(object = data_list, max.level = 1)
```

The list of the four training data sets (the three simulated training modalities and the meta-data) is returned.

# D - Predicting #

@@ -194,6 +202,13 @@ createTestLayer(testing = testing,
test_data = multi_omics$testing$methylation)
```

A look at the testing data.

```{r basic_test_data, include=TRUE, eval=TRUE}
data_list <- extractData(object = testing)
str(object = data_list, max.level = 1)
```

We can also generate an upset plot to visualize patient overlap across testing layers.

```{r upsetplot_new, include=TRUE, eval=TRUE}
@@ -238,7 +253,6 @@ perf_overlapping <- sapply(X = actual_pred[complete.cases(actual_pred),
print(perf_overlapping)
```

Note that our example is based on simulated data for illustration purposes; a single run is not enough to assess the performance of our models.

# E - Interface and wrapping #

@@ -294,12 +308,12 @@ mylasso <- function (x, y,
nlambda = 25,
nfolds = 5) {
# Perform cross-validation to find the optimal lambda
cv_lasso <- cv.glmnet(x = as.matrix(x), y = y,
cv_lasso <- glmnet::cv.glmnet(x = as.matrix(x), y = y,
family = "binomial",
type.measure = "deviance",
nfolds = nfolds)
best_lambda <- cv_lasso$lambda.min
lasso_best <- glmnet(x = as.matrix(x), y = y,
lasso_best <- glmnet::glmnet(x = as.matrix(x), y = y,
family = "binomial",
alpha = 1,
lambda = best_lambda
@@ -342,7 +356,7 @@ library(knitr)
# Create a data frame
data <- data.frame(
  Learner = c("weightedMeanLearner", "bestSpecificLearner"),
  Learner = c("weightedMeanLearner", "bestLayerLearner"),
Description = c("The weighted mean meta learner. It uses meta data to estimate the weights of the modality-specific models", "The best layer-specific model is used as meta model.")
)
