forked from saezlab/decoupleR
-
Notifications
You must be signed in to change notification settings - Fork 1
/
README.Rmd
181 lines (140 loc) · 6.48 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# decoupleR <img src="man/figures/logo.png" align="right" width="120" />
<!-- badges: start -->
[![Lifecycle: maturing](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle/#maturing)
[![BioC status](http://www.bioconductor.org/shields/build/release/bioc/decoupleR.svg)](https://bioconductor.org/checkResults/release/bioc-LATEST/decoupleR)
[![BioC dev status](http://www.bioconductor.org/shields/build/devel/bioc/decoupleR.svg)](https://bioconductor.org/checkResults/devel/bioc-LATEST/decoupleR)
[![R build status](https://github.com/saezlab/decoupleR/workflows/R-CMD-check-bioc/badge.svg)](https://github.com/saezlab/decoupleR/actions)
[![Codecov test coverage](https://codecov.io/gh/saezlab/decoupleR/branch/master/graph/badge.svg)](https://codecov.io/gh/saezlab/decoupleR?branch=master)
[![GitHub issues](https://img.shields.io/github/issues/saezlab/decoupleR)](https://github.com/saezlab/decoupleR/issues)
<!-- badges: end -->
## Overview
Many methods allow us to extract biological activities from omics data using
information from prior knowledge resources, reducing the dimensionality for
increased statistical power and better interpretability.
Here, we present decoupleR, a Bioconductor package containing different
statistical methods to extract these signatures within a unified framework.
decoupleR allows the user to flexibly test any method with any resource.
It incorporates methods that take into account the sign and weight of
network interactions. decoupleR can be used with any omic, as long as its
features can be linked to a biological process based on prior knowledge.
For example, in transcriptomics gene sets regulated by a transcription
factor, or in phospho-proteomics phosphosites that are targeted by a kinase.
<img src="https://github.com/saezlab/decoupleR/blob/master/inst/figures/graphical_abstract.png?raw=1" align="center" width="800">
For more information about how this package has been used with real data,
please check the following links:
- [decoupleR's manuscript repository](https://github.com/saezlab/decoupleR_manuscript)
- [Creation of benchmarking pipelines](https://github.com/saezlab/decoupleRBench)
- [Example of Kinase and TF activity estimation](https://saezlab.github.io/kinase_tf_mini_tuto/)
## Installation instructions
Get the latest stable `R` release from
[CRAN](http://cran.r-project.org/).
Then install **decoupleR** using from [Bioconductor](http://bioconductor.org/) the
following code:
```{r bioconductor_install, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE)) {
install.packages("BiocManager")
}
BiocManager::install("decoupleR")
# Check that you have a valid Bioconductor installation
BiocManager::valid()
```
Then install development version from [GitHub](https://github.com/) with:
```{r github_installation_bioc, eval = FALSE}
BiocManager::install("saezlab/decoupleR")
```
Or with:
```{r github_installation_devt, eval = FALSE}
devtools::install_github("saezlab/decoupleR")
```
## Usage
### Load package and data
We first load the test data included inside **decoupleR**. It consist of a matrix
(`mat`) with logFC coming from transcriptomics, and a collection of transcription
factors that target gene sets with a certain mode of regulation (either positive
or negative).
```{r usage-load_data, message=FALSE}
library(decoupleR)
inputs_dir <- system.file("testdata", "inputs", package = "decoupleR")
mat <- file.path(inputs_dir, "input-expr_matrix.rds") %>%
readRDS() %>%
dplyr::glimpse()
network <- file.path(inputs_dir, "input-dorothea_genesets.rds") %>%
readRDS() %>%
dplyr::glimpse()
```
**Important**: Before running any method in **decoupleR**, we recommend the user to
intersect their prior knowledge with their input matrix using
`intersect_regulons`. This allows to filter out "regulons" (biological processes)
with less than a minimum number of target features. We recommend to set this
value (`minsize`) to at least 5:
```{r usage-intersect_regulons, message=FALSE}
# Remove TFs with less than 5 targets in the input matrix
network <- intersect_regulons(mat, network, tf, target, minsize = 5)
```
### Methods
To check how many methods are currently available in **decoupleR**, run:
```{r usage-show_methods}
show_methods()
```
Function is the function name in **decoupleR** and Name is the full name of each
method. To check how to use individual methods, for example `mlm`, run
`?run_mlm`.
All methods follow the same design pattern and arguments, so moving between
methods should be easy. Here is an example with `mlm`:
```{r usage-run_mlm}
run_mlm(
mat = mat,
network = network,
.source = "tf",
.target = "target",
.mor = "mor",
.likelihood = "likelihood"
)
```
### Decouple wrapper
Moreover, **decoupleR** allows to run multiple methods at the same time with the
function `decouple()`.
Statistic functions inside **decoupleR** always return a tidy tibble that can be
easily processed with the tools provide by the
[tidyverse ecosystem](https://www.tidyverse.org/).
```{r usage-decouple_function, message=FALSE}
decouple(
mat = mat,
network = network,
.source = "tf",
.target = "target",
statistics = c("mlm", "wmean", "ulm", "ora"),
args = list(
mlm = list(center=FALSE),
wmean = list(times=100),
ulm = list(center=FALSE),
ora = list(n_up=150, n_bottom=0)
),
consensus_score = TRUE
)
```
It can generate a consensus score between the methods if `consensus_score = TRUE`.
## Citation
Badia-i-Mompel P., Vélez J., Braunger J., Geiss C., Dimitrov D., Müller-Dott S., Taus P., Dugourd A., Holland
C.H., Ramirez Flores R.O. and Saez-Rodriguez J. 2021. decoupleR: Ensemble of computational methods to infer
biological activities from omics data. bioRxiv. https://doi.org/10.1101/2021.11.04.467271
## Contributing to decoupleR
Are you interested in adding a new statistical method or collaborating in the
development of internal tools that allow the extension of the package?
Please check out our
[contribution guide](https://saezlab.github.io/decoupleR/CONTRIBUTING.html).
---
Please note that this project is released with a [Contributor Code of Conduct](https://saezlab.github.io/decoupleR/CODE_OF_CONDUCT).
By participating in this project you agree to abide by its terms.