# WikiPathways analysis
[WikiPathways](https://www.wikipathways.org) is a continuously updated pathway database curated by a community of researchers and pathway enthusiasts. WikiPathways publishes monthly releases of GMT files for supported organisms at [data.wikipathways.org](http://data.wikipathways.org/current/gmt/). The `r Biocpkg("clusterProfiler")` package [@yu2012] supports enrichment analysis (either ORA or GSEA) for WikiPathways through the `enrichWP()` and `gseWP()` functions. These functions automatically download and parse the latest WikiPathways GMT file for the selected organism.
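To make the parsing step concrete, here is a minimal base R sketch of how a WikiPathways GMT line could be split into term-to-gene and term-to-name tables. It assumes each line has the layout `name%version%wpid%organism`, then a URL, then tab-separated gene IDs. The two input lines, their IDs and URLs, and the helper `parse_wp_line()` are made up for illustration; this is not the code that `enrichWP()` runs internally.

```r
# Two made-up lines in the assumed WikiPathways GMT layout:
# "<name>%<version>%<wpid>%<organism>" <TAB> <url> <TAB> <gene IDs...>
gmt_lines <- c(
  "Example pathway A%WikiPathways_00000000%WP0001%Homo sapiens\thttp://example.org/WP0001\t101\t102\t103",
  "Example pathway B%WikiPathways_00000000%WP0002%Homo sapiens\thttp://example.org/WP0002\t102\t104"
)

# Hypothetical helper: turn one GMT line into a long data frame (one row per gene)
parse_wp_line <- function(line) {
  fields <- strsplit(line, "\t", fixed = TRUE)[[1]]
  header <- strsplit(fields[1], "%", fixed = TRUE)[[1]]
  data.frame(
    name = header[1],          # pathway name
    wpid = header[3],          # pathway identifier
    gene = fields[-(1:2)],     # drop the header and URL fields, keep gene IDs
    stringsAsFactors = FALSE
  )
}

wp2gene <- do.call(rbind, lapply(gmt_lines, parse_wp_line))

# Term-to-gene and term-to-name tables
wpid2gene <- wp2gene[, c("wpid", "gene")]
wpid2name <- unique(wp2gene[, c("wpid", "name")])

print(wpid2gene)
print(wpid2name)
```

A pathway name that itself contains `%` would break this simple split, so real files are better handled by the package functions.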
Supported organisms can be listed by:
```{r get-wp-organisms}
get_wp_organisms()
```
<!--
Download the appropriate GMT file and then generate `TERM2GENE` and `TERM2NAME` tables to use with the `enricher()` and `GSEA()` functions.
Users can download the WikiPathways GMT file manually:
```r
###################################
# download file manually
# 1. visit the website, http://data.wikipathways.org/current/gmt/, to get the url
# 2. use the following code to download it
####################################
url <- "http://data.wikipathways.org/current/gmt/wikipathways-20200810-gmt-Homo_sapiens.gmt"
wpgmtfile <- "wikiPathways-HS.gmt"
download.file(url, destfile = wpgmtfile)
```
As an alternative to manually downloading GMT files, install the `r Biocpkg("rWikiPathways")` package to gain scripted access to the latest GMT files through the `downloadPathwayArchive()` function.
```r
## supported organisms can be accessed via the following command:
## rWikiPathways::listOrganisms()
wpgmtfile <- rWikiPathways::downloadPathwayArchive(organism="Homo sapiens", format = "gmt")
```
Once the GMT file has been downloaded, we can use `read.gmt.wp()` to parse it. Note that `read.gmt.wp()` is designed for WikiPathways GMT files; for ordinary GMT files, use the `read.gmt()` function instead.
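```r
library(clusterProfiler)
library(magrittr)  # provides the %>% pipe

# Parse the WikiPathways GMT file into a long data frame
wp2gene <- read.gmt.wp(wpgmtfile)
# TERM2GENE: pathway ID to gene
wpid2gene <- wp2gene %>% dplyr::select(wpid, gene)
# TERM2NAME: pathway ID to pathway name
wpid2name <- wp2gene %>% dplyr::select(wpid, name)

# Over-representation analysis; `gene` and `geneList` are defined in the sections below
ewp <- enricher(gene, TERM2GENE = wpid2gene, TERM2NAME = wpid2name)
head(ewp)

# Gene set enrichment analysis on the ranked gene list
ewp2 <- GSEA(geneList, TERM2GENE = wpid2gene, TERM2NAME = wpid2name, verbose = FALSE)
head(ewp2)
```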
-->
## WikiPathways over-representation analysis {#clusterprofiler-wikipathway-ora}
```{r enrichwp}
data(geneList, package="DOSE")
gene <- names(geneList)[abs(geneList) > 2]
enrichWP(gene, organism = "Homo sapiens")
```
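For intuition about what an over-representation test measures, ORA is commonly formulated as a one-sided hypergeometric test on the overlap between the input genes and a pathway. The sketch below uses made-up counts for a single hypothetical pathway; it illustrates the general idea and is not a description of the internals of `enrichWP()`.

```r
# Made-up counts for one hypothetical pathway
N <- 10000  # genes in the background
M <- 100    # background genes annotated to the pathway
n <- 200    # genes of interest (e.g. the `gene` vector)
k <- 8      # genes of interest that fall in the pathway

# Probability of seeing k or more pathway genes by chance
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Overlap expected by chance alone, for comparison with k
expected <- n * M / N

print(p_value)
print(expected)
```

With these counts the observed overlap of 8 is four times the chance expectation of 2, which is why the p-value comes out small.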
## WikiPathways gene set enrichment analysis {#clusterprofiler-wikipathway-gsea}
```{r gsewp}
gseWP(geneList, organism = "Homo sapiens")
```
If your input gene IDs are not Entrez gene IDs, you can use the [`bitr()`](#bitr) function to convert them. If you want to convert the gene IDs in the output to gene symbols, you can use the [`setReadable()`](#setReadable) function.
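The two conversions can be sketched as follows. This example assumes human data and the `org.Hs.eg.db` annotation package; the four gene symbols are arbitrary illustrations, and `enrichWP()` needs network access to fetch the GMT file.

```r
library(clusterProfiler)
library(org.Hs.eg.db)  # assumed annotation package for human genes

# Input side: map gene symbols to Entrez gene IDs before the analysis
symbols <- c("TP53", "BRCA1", "EGFR", "MYC")  # arbitrary example symbols
id_map <- bitr(symbols, fromType = "SYMBOL", toType = "ENTREZID",
               OrgDb = org.Hs.eg.db)
head(id_map)

# Output side: replace Entrez IDs in the result with gene symbols
data(geneList, package = "DOSE")
gene <- names(geneList)[abs(geneList) > 2]
ewp <- enrichWP(gene, organism = "Homo sapiens")
ewp_readable <- setReadable(ewp, OrgDb = org.Hs.eg.db, keyType = "ENTREZID")
head(ewp_readable)
```

`bitr()` returns a data frame with one column per ID type, so the `ENTREZID` column of `id_map` is what would be passed on to `enrichWP()`.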