ErmineR

This is an R wrapper for Pavlidis Lab’s ermineJ, a tool for gene set enrichment analysis with multifunctionality correction.

Installation

ermineR requires a 64-bit version of Java to function. If you are a Mac user, make sure you have the Java SDK.

After Java is installed, you can install ermineR by running:

devtools::install_github('PavlidisLab/ermineR')

If ermineR cannot find your Java home by itself, either install rJava or use Sys.setenv(JAVA_HOME=javaHome) to point ermineR to the right path.
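As a minimal sketch of the Sys.setenv route, the snippet below sets JAVA_HOME for the current R session. The path is a placeholder chosen for illustration, not a value taken from ermineR; substitute the directory of your own 64-bit Java installation.

```r
# Placeholder path (an assumption for illustration); replace it with your own Java home.
javaHome <- "/usr/lib/jvm/java-17-openjdk-amd64"

if (!dir.exists(javaHome)) {
  message("javaHome does not exist on this machine; edit the path first")
}

# Set the variable for this R session only; your shell profile is not changed.
Sys.setenv(JAVA_HOME = javaHome)
Sys.getenv("JAVA_HOME")
```

Because the variable only lives for the session, the call would need to be repeated (or placed in a startup file such as .Rprofile) each time R is restarted.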

Some users report that the ermineJ executable loses its execute permission after installation. If this happens, you will get an error like:

"Error in (function (annotation = NULL, aspects = c("Molecular Function",  :
 Something went wrong. Blame the dev
sh: [library installation path]/ermineR/ermineJ-3.1.2/bin/ermineJ.sh: Permission denied "

To fix this, run:

chmod +x [library installation path]/ermineR/ermineJ-3.1.2/bin/ermineJ.sh

You may need sudo depending on where you install your packages.
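The same fix can be attempted from within R, which avoids looking up the library installation path by hand. This is a sketch under two assumptions: makeExecutable is a hypothetical helper (it is not part of ermineR), and the script sits at ermineJ-3.1.2/bin/ermineJ.sh inside the installed package, as in the error message above.

```r
# Hypothetical helper (not part of ermineR): add execute permission to a script.
makeExecutable <- function(path) {
  if (!nzchar(path) || !file.exists(path)) {
    message("Script not found; check the installation path")
    return(invisible(FALSE))
  }
  # 0755 = owner can read/write/execute, everyone else can read/execute.
  Sys.chmod(path, mode = "0755", use_umask = FALSE)
  # file.access() returns 0 when the requested access (1 = execute) is allowed.
  invisible(file.access(path, mode = 1) == 0)
}

# system.file() returns "" when the package or the file cannot be found.
script <- system.file("ermineJ-3.1.2", "bin", "ermineJ.sh", package = "ermineR")
makeExecutable(script)
```

If the package library is owned by another user, Sys.chmod will not be able to change the mode and the chmod command with sudo remains the fallback.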

Usage

See the documentation for ora, roc, gsr, precRecall and corr to learn how to use them.

The documentation of each function explains what the method does. We recommend that users start with precRecall (for gene ranking-based enrichment analysis) or ora (for hit-list over-representation analysis).

Replicable GO terms

GO terms are updated frequently, so results can differ between versions. By default, all ermineR functions fetch the latest GO version; this means you may get different results when you repeat an analysis later. If you want to use a specific version of GO, ermineR provides the following functions:

  • goToday: Downloads the latest version of GO to a path you provide
  • getGoDates: Lists all dates for which a GO version is available, from most recent to oldest
  • goAtDate: Given a valid date, downloads the GO version from that date to a file path you provide

To use a specific version of GO, set the geneSetDescription argument of every ermineR function you call to the file path where you saved the GO terms.
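One way to pin an analysis to a GO release is to pick the newest available version on or before a chosen date. The sketch below uses a hypothetical helper, pickGoDate, which is not part of ermineR, and assumes that getGoDates returns dates as "YYYY-MM-DD" strings. The example dates are made up to show the behaviour.

```r
# Hypothetical helper (not part of ermineR): choose the newest available GO
# release on or before a target date. Assumes "YYYY-MM-DD" date strings.
pickGoDate <- function(availableDates, target) {
  dates <- as.Date(availableDates)
  eligible <- dates[dates <= as.Date(target)]
  if (length(eligible) == 0) stop("No GO version on or before ", target)
  format(max(eligible))
}

# Made-up dates, used only to show how the helper behaves.
exampleDates <- c("2021-03-01", "2021-02-01", "2021-01-01")
pickGoDate(exampleDates, "2021-02-15")  # returns "2021-02-01"

# With ermineR installed, the helper could feed goAtDate(). The argument
# names `path` and `date` are assumptions; confirm them with ?goAtDate.
# goDate <- pickGoDate(ermineR::getGoDates(), "2021-02-15")
# ermineR::goAtDate(path = "go_frozen.xml", date = goDate)
# ora(..., geneSetDescription = "go_frozen.xml")
```

Recording the chosen date alongside your results makes it possible to download the same GO file again later.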

Annotations

ErmineR requires annotation files to work. These files include gene identifiers and their GO annotations, along with some optional information. By default, ermineR supports annotation files generated by Gemma and will automatically download them if you provide a valid annotation name. You can get a list of valid annotation names using listGemmaAnnotations(). As a general rule, if your platform has an identifier in GEO, the identifier that starts with “GPL” is used as the Gemma identifier as well. There are also generic annotation files available that contain all genes from a species. These are typically named something like “Generic_human”.

You can manually download these annotation files from https://gemma.msl.ubc.ca/annots/ or by using the gemma.R::get_platform_annotations function. ErmineR typically uses “noParents” versions of these files since parent terms are derived using the ontology file acquired from GO.
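The naming conventions described above can be used to sort annotation names into platform-specific and generic ones. The sketch below works on a small illustrative vector; it assumes that listGemmaAnnotations() returns a character vector of names, which you should verify before substituting its output.

```r
# Illustrative names only: "GPL96" and "Generic_human" appear in this README,
# "GPL570" is another GEO platform ID added for the example.
annotationNames <- c("GPL96", "Generic_human", "GPL570")

# Platform-specific annotations follow the GEO "GPL" convention.
platformNames <- grep("^GPL[0-9]+$", annotationNames, value = TRUE)
# Generic annotations cover all genes from a species.
genericNames <- grep("^Generic_", annotationNames, value = TRUE)

platformNames
genericNames
```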

Examples

Use GSR with gene scores

Here we will use a mock scores file located in our tests directory. The scores file was specifically created to be enriched for genes annotated with the term GO:0051082.

library(ermineR)
library(dplyr) # provides the %>% pipe used in the examples

scores = read.table("tests/testthat/testFiles/pValues", header = TRUE, row.names = 1)
head(scores)
##                pvalue
## 206190_at   0.3163401
## 208385_at   0.5186824
## 65086_at    0.6620389
## 202281_at   0.4068895
## 211622_s_at 0.9128846
## 219257_s_at 0.2936740
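The table printed above has probe IDs as row names and a single pvalue column. If your scores are not in a file, a data frame of the same shape can be built in memory. In this sketch the probe IDs are taken from the output above, while the p-values are random placeholders and not real results.

```r
# Probe IDs come from the output shown above; the p-values are random
# placeholders generated for illustration.
set.seed(1)
probes <- c("206190_at", "208385_at", "65086_at")
myScores <- data.frame(pvalue = runif(length(probes)), row.names = probes)
myScores
```

A data frame like myScores could then be passed as the scores argument with scoreColumn = 1, in the same way as the scores object read from the file.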

This scores file only includes scores for 118 genes. The file was generated using GPL96’s probesets, so that is the annotation we’ll be using. Any gene that is not represented in the scores file will be ignored.

gsrOut = gsr(annotation = 'GPL96',
             scores = scores,
             scoreColumn = 1,
             iterations = 10000,
             bigIsBetter = FALSE,
             logTrans = TRUE)
## Attempting to download annotation file
head(gsrOut$results) %>% knitr::kable()
Name ID NumProbes NumGenes RawScore Pval CorrectedPvalue MFPvalue CorrectedMFPvalue Multifunctionality Same as GeneMembers
protein folding GO:0006457 40 24 3.198073 0.0000000 0.0000000 0.000000 0.0000000 0.145 NA AIP|CALR|CCT5|CCT6A|CCT8L2|CDC37L1|CLGN|CLPX|DNAJB1|DNAJB4|GAK|HSP90AA1|HSPA1A|HSPA9|HSPB6|HSPD1|NUDC|PFDN6|PTGES3|RUVBL2|ST13|TCP1|TOR1A|UGGT1|
unfolded protein binding GO:0051082 52 29 3.299625 0.0000000 0.0000000 0.000000 0.0000000 0.174 NA AIP|CALR|CCT5|CCT6A|CCT8L2|CDC37L1|CHAF1A|CLGN|CLPX|DNAJB1|DNAJB4|HSP90AA1|HSPA1A|HSPA9|HSPB6|HSPD1|HTRA2|NUDC|PFDN6|PTGES3|RUVBL2|SHQ1|SRSF10|TAPBP|TCP1|TOMM20|TOR1A|TUBB4B|UGGT1|
cytosol GO:0005829 74 41 2.118055 0.0005727 0.0150806 0.000400 0.0105333 0.493 NA AAMP|AIP|ARC|BHMT2|CALR|CCT5|CCT6A|CDC37L1|CLPX|CRABP1|DNAJB1|DNAJB4|EPHB2|FRS2|GAK|GEMIN2|HCK|HSP90AA1|HSPA1A|HSPB6|HSPD1|HTRA2|NELFA|NUDC|OGG1|PASK|PEX5|PIKFYVE|POLR3K|PRKCI|PTGES3|RUVBL2|SHQ1|SPHK1|SRSF10|ST13|TCP1|TOR1A|TUBB4B|UNC13B|USP33|
cellular component organization GO:0016043 67 34 2.146421 0.0024970 0.0493218 0.004267 0.0842792 0.829 NA ARC|CALR|CHAF1A|CLGN|DDX46|EPHB2|GAK|GEMIN2|HCK|HSP90AA1|HSPA1A|HSPA9|HSPD1|HTRA2|NUDC|PEX5|PFDN6|PIKFYVE|PRKCI|PTGES3|RUVBL2|SEMA3B|SHQ1|SRSF10|SULF1|TAPBP|TCP1|TOMM20|TOR1A|TUBB4B|UNC13B|USP33|VPS8|ZNF207|
cellular component organization or biogenesis GO:0071840 67 34 2.146421 0.0024970 0.0394574 0.004267 0.0674234 0.828 NA ARC|CALR|CHAF1A|CLGN|DDX46|EPHB2|GAK|GEMIN2|HCK|HSP90AA1|HSPA1A|HSPA9|HSPD1|HTRA2|NUDC|PEX5|PFDN6|PIKFYVE|PRKCI|PTGES3|RUVBL2|SEMA3B|SHQ1|SRSF10|SULF1|TAPBP|TCP1|TOMM20|TOR1A|TUBB4B|UNC13B|USP33|VPS8|ZNF207|
cellular component assembly GO:0022607 45 22 2.386422 0.0032000 0.0421333 0.004500 0.0592500 0.556 NA ARC|CALR|CHAF1A|CLGN|DDX46|EPHB2|GAK|GEMIN2|HSP90AA1|HSPA1A|HSPA9|HSPD1|PFDN6|PIKFYVE|PTGES3|RUVBL2|SHQ1|SRSF10|TAPBP|TCP1|UNC13B|ZNF207|
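To keep only the gene sets that remain significant after multifunctionality correction, the results table can be filtered on CorrectedMFPvalue. The sketch below rebuilds two columns of the output above by hand so that it runs on its own; the 0.05 cutoff is an arbitrary choice for illustration, and the column is assumed to be numeric in the real results object.

```r
# IDs and CorrectedMFPvalue values copied from the gsr output shown above.
res <- data.frame(
  ID = c("GO:0006457", "GO:0051082", "GO:0005829",
         "GO:0016043", "GO:0071840", "GO:0022607"),
  CorrectedMFPvalue = c(0, 0, 0.0105333, 0.0842792, 0.0674234, 0.0592500),
  stringsAsFactors = FALSE
)

# 0.05 is an arbitrary threshold chosen for this example.
significant <- res[res$CorrectedMFPvalue < 0.05, ]
significant$ID  # "GO:0006457" "GO:0051082" "GO:0005829"
```

On a real run, the same subsetting would be applied to gsrOut$results instead of the hand-built res table.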

Use Precision Recall with gene scores

We will use the same scores file as in the example above.

precRecallOut = precRecall(annotation = 'GPL96',
                           scores = scores,
                           scoreColumn = 1,
                           iterations = 10000,
                           bigIsBetter = FALSE,
                           logTrans = TRUE)
## Attempting to download annotation file
head(precRecallOut$results) %>% knitr::kable()
Name ID NumProbes NumGenes RawScore Pval CorrectedPvalue MFPvalue CorrectedMFPvalue Multifunctionality Same as GeneMembers
binding GO:0005488 143 81 0.9992282 0.0000 0.0000000 0.0000 0.0000000 0.479 NA AAMP|AIP|ARC|ARF3|BHMT2|C5AR2|CACNA1F|CALR|CCNG1|CCT5|CCT6A|CCT8L2|CDC37L1|CHAF1A|CLGN|CLPX|CPT1A|CRABP1|DDX46|DMBT1|DNAJB1|DNAJB4|DZIP3|EPHB2|FBLN2|FOXB1|FPR3|FRS2|GAK|GEMIN2|GPR17|HCK|HMGCR|HSP90AA1|HSPA1A|HSPA9|HSPB6|HSPD1|HTRA2|ITIH2|KCNJ1|LIPF|MAN1B1|NELFA|NR2E3|NUDC|OGG1|PASK|PEX5|PFDN6|PIKFYVE|PLCH1|POLR3K|PPARA|PRKCI|PRPSAP1|PTGES3|RUVBL2|SEMA3B|SHOX2|SHQ1|SLC22A14|SLC24A1|SPHK1|SRSF10|ST13|SULF1|TAPBP|TBKBP1|TCP1|TNFRSF12A|TOMM20|TOR1A|TUBB4B|UGGT1|UNC13B|USP33|VPS8|YIPF2|ZCCHC8|ZNF207|
protein folding GO:0006457 40 24 0.7176581 0.0000 0.0000000 0.0000 0.0000000 0.145 NA AIP|CALR|CCT5|CCT6A|CCT8L2|CDC37L1|CLGN|CLPX|DNAJB1|DNAJB4|GAK|HSP90AA1|HSPA1A|HSPA9|HSPB6|HSPD1|NUDC|PFDN6|PTGES3|RUVBL2|ST13|TCP1|TOR1A|UGGT1|
unfolded protein binding GO:0051082 52 29 0.8507590 0.0000 0.0000000 0.0000 0.0000000 0.174 NA AIP|CALR|CCT5|CCT6A|CCT8L2|CDC37L1|CHAF1A|CLGN|CLPX|DNAJB1|DNAJB4|HSP90AA1|HSPA1A|HSPA9|HSPB6|HSPD1|HTRA2|NUDC|PFDN6|PTGES3|RUVBL2|SHQ1|SRSF10|TAPBP|TCP1|TOMM20|TOR1A|TUBB4B|UGGT1|
protein binding GO:0005515 127 73 0.9569410 0.0002 0.0039500 0.0001 0.0019750 0.267 NA AAMP|AIP|ARC|ARF3|C5AR2|CALR|CCNG1|CCT5|CCT6A|CCT8L2|CDC37L1|CHAF1A|CLGN|CLPX|CPT1A|CRABP1|DDX46|DMBT1|DNAJB1|DNAJB4|DZIP3|EPHB2|FBLN2|FOXB1|FPR3|FRS2|GAK|GEMIN2|GPR17|HCK|HMGCR|HSP90AA1|HSPA1A|HSPA9|HSPB6|HSPD1|HTRA2|ITIH2|NELFA|NR2E3|NUDC|OGG1|PASK|PEX5|PFDN6|PIKFYVE|POLR3K|PPARA|PRKCI|PRPSAP1|PTGES3|RUVBL2|SEMA3B|SHQ1|SLC22A14|SLC24A1|SPHK1|SRSF10|ST13|TAPBP|TBKBP1|TCP1|TNFRSF12A|TOMM20|TOR1A|TUBB4B|UGGT1|UNC13B|USP33|VPS8|YIPF2|ZCCHC8|ZNF207|
cytosol GO:0005829 74 41 0.7108333 0.0013 0.0205400 0.0011 0.0144833 0.493 NA AAMP|AIP|ARC|BHMT2|CALR|CCT5|CCT6A|CDC37L1|CLPX|CRABP1|DNAJB1|DNAJB4|EPHB2|FRS2|GAK|GEMIN2|HCK|HSP90AA1|HSPA1A|HSPB6|HSPD1|HTRA2|NELFA|NUDC|OGG1|PASK|PEX5|PIKFYVE|POLR3K|PRKCI|PTGES3|RUVBL2|SHQ1|SPHK1|SRSF10|ST13|TCP1|TOR1A|TUBB4B|UNC13B|USP33|
cellular anatomical entity GO:0110165 142 80 0.9860952 0.0020 0.0263333 0.0011 0.0173800 0.366 NA AAMP|AIP|ARC|ARF3|BHMT2|C5AR2|CACNA1F|CALR|CCNG1|CCT5|CCT6A|CDC37L1|CHAF1A|CLGN|CLPX|CPT1A|CRABP1|DDX46|DMBT1|DNAJB1|DNAJB4|DZIP3|EPHB2|FBLN2|FOXB1|FPR3|FRS2|GAK|GEMIN2|GPR17|HCK|HMGCR|HSP90AA1|HSPA1A|HSPA9|HSPB6|HSPD1|HTRA2|ITIH2|KCNJ1|LIPF|MAN1B1|NELFA|NR2E3|NUDC|OGG1|PASK|PEX5|PFDN6|PIKFYVE|PLCH1|POLR3K|PPARA|PRKCI|PRPSAP1|PTGES3|RUVBL2|SEMA3B|SHOX2|SHQ1|SLC22A14|SLC24A1|SPHK1|SRSF10|ST13|SULF1|TAPBP|TBKBP1|TCP1|TNFRSF12A|TOMM20|TOR1A|TUBB4B|UGGT1|UNC13B|USP33|VPS8|YIPF2|ZCCHC8|ZNF207|

Use ORA with a hitlist

library(dplyr)


# genes for GO:0051082
hitlist = c("AAMP", "AFG3L2", "AHSP", "AIP", "AIPL1", "APCS", "BBS12", 
            "CALR", "CALR3", "CANX", "CCDC115", "CCT2", "CCT3", "CCT4", "CCT5", 
            "CCT6A", "CCT6B", "CCT7", "CCT8", "CCT8L1P", "CCT8L2", "CDC37", 
            "CDC37L1", "CHAF1A", "CHAF1B", "CLGN", "CLN3", "CLPX", "CRYAA", 
            "CRYAB", "DNAJA1", "DNAJA2", "DNAJA3", "DNAJA4", "DNAJB1", "DNAJB11", 
            "DNAJB13", "DNAJB2", "DNAJB4", "DNAJB5", "DNAJB6", "DNAJB8", 
            "DNAJC4", "DZIP3", "ERLEC1", "ERO1B", "FYCO1", "GRPEL1", "GRPEL2", 
            "GRXCR2", "HEATR3", "HSP90AA1", "HSP90AA2P", "HSP90AA4P", "HSP90AA5P", 
            "HSP90AB1", "HSP90AB2P", "HSP90AB3P", "HSP90AB4P", "HSP90B1", 
            "HSP90B2P", "HSPA1A", "HSPA1B", "HSPA1L", "HSPA2", "HSPA5", "HSPA6", 
            "HSPA8", "HSPA9", "HSPB6", "HSPD1", "HSPE1", "HTRA2", "LMAN1", 
            "MDN1", "MKKS", "NAP1L4", "NDUFAF1", "NPM1", "NUDC", "NUDCD2", 
            "NUDCD3", "PDRG1", "PET100", "PFDN1", "PFDN2", "PFDN4", "PFDN5", 
            "PFDN6", "PIKFYVE", "PPIA", "PPIB", "PTGES3", "RP2", "RUVBL2", 
            "SCAP", "SCG5", "SERPINH1", "SHQ1", "SIL1", "SPG7", "SRSF10", 
            "SRSF12", "ST13", "SYVN1", "TAPBP", "TCP1", "TMEM67", "TOMM20", 
            "TOR1A", "TRAP1", "TTC1", "TUBB4B", "UGGT1", "ZFYVE21")
oraOut = ora(annotation = 'Generic_human',
             hitlist = hitlist)

head(oraOut$results) %>% knitr::kable()
Name ID NumProbes NumGenes RawScore Pval CorrectedPvalue MFPvalue CorrectedMFPvalue Multifunctionality Same as GeneMembers
unfolded protein binding GO:0051082 116 116 99 0 0 0 0 0.726 NA AFG3L2|AHSP|AIP|AIPL1|APCS|CALR|CALR3|CANX|CCAR2|CCDC115|CCT2|CCT3|CCT4|CCT5|CCT6A|CCT6B|CCT7|CCT8|CCT8L2|CDC37|CDC37L1|CHAF1A|CHAF1B|CLGN|CLPX|CLU|CRYAA|CRYAB|DNAJA1|DNAJA2|DNAJA3|DNAJA4|DNAJB1|DNAJB11|DNAJB13|DNAJB2|DNAJB3|DNAJB4|DNAJB5|DNAJB6|DNAJB7|DNAJB8|DNAJC4|ERLEC1|ERN1|ERN2|ERO1B|GRPEL1|GRPEL2|HEATR3|HSP90AA1|HSP90AB1|HSP90AB4P|HSP90B1|HSP90B2P|HSPA1A|HSPA1B|HSPA1L|HSPA2|HSPA5|HSPA6|HSPA8|HSPA9|HSPB1|HSPB2|HSPB6|HSPD1|HSPE1|HTRA2|HYOU1|LMAN1|MKKS|NACA|NACA2|NACA4P|NACAD|NAP1L4|NDUFAF1|NPM1|NUDC|NUDCD2|NUDCD3|PDRG1|PET100|PFDN1|PFDN2|PFDN4|PFDN5|PFDN6|PPIA|PPIB|PTGES3|RP2|RUVBL2|SCAP|SCG5|SERPINH1|SHQ1|SIL1|SPG7|SRSF10|SRSF12|SSUH2|SYVN1|TAPBP|TCP1|TIMM10B|TMEM67|TOMM20|TOR1A|TRAP1|TTC1|TUBB4B|UGGT1|UGGT2|VBP1|
protein folding chaperone GO:0044183 60 60 37 0 0 0 0 0.638 NA ANP32E|APLF|CALR|CALR3|CCDC47|CCT2|CCT3|CCT4|CCT5|CCT6A|CCT6B|CCT7|CCT8|CCT8L2|CD74|CLGN|CLPX|DFFA|DNAJB1|DNAJB6|DNAJB7|DNAJB8|FKBP8|HSP90AA1|HSP90AB1|HSP90AB4P|HSP90B1|HSP90B2P|HSPA13|HSPA14|HSPA1A|HSPA1B|HSPA1L|HSPA2|HSPA4|HSPA4L|HSPA5|HSPA6|HSPA7|HSPA8|HSPA9|HSPB1|HSPB6|HSPD1|HSPE1|HSPH1|HYOU1|HYPK|KHSRP|LYRM7|PDCL3|PFDN1|PFDN2|RIC3|TCP1|TOR1A|TRAP1|WDR83OS|WIPF1|ZMYND10|
chaperone-mediated protein folding GO:0061077 74 74 35 0 0 0 0 0.588 NA BAG1|CCT2|CCT3|CCT4|CCT5|CCT6A|CCT7|CCT8|CD74|CHORDC1|CLU|CRTAP|CSNK2A1|DFFA|DNAJB1|DNAJB12|DNAJB13|DNAJB14|DNAJB2|DNAJB3|DNAJB4|DNAJB5|DNAJB6|DNAJB7|DNAJB8|DNAJC18|DNAJC24|DNAJC5|DNAJC7|ERO1A|FKBP11|FKBP2|FKBP4|FKBP5|GAK|HSPA13|HSPA14|HSPA1A|HSPA1B|HSPA1L|HSPA2|HSPA5|HSPA6|HSPA7|HSPA8|HSPA9|HSPB1|HSPB6|HSPE1|HSPH1|P3H1|PDCL3|PDIA4|PEX19|PFDN1|PFDN2|PFDN4|PFDN5|PFDN6|PPIB|PPID|PTGES3|SDF2|SDF2L1|ST13|TCP1|TOR1A|TOR1B|TOR2A|TRAP1|UMOD|UNC45A|UNC45B|VBP1|
ATP-dependent protein folding chaperone GO:0140662 34 34 27 0 0 0 0 0.662 NA CCT2|CCT3|CCT4|CCT5|CCT6A|CCT6B|CCT7|CCT8|CCT8L2|CLPX|HSP90AA1|HSP90AB1|HSP90AB4P|HSP90B1|HSP90B2P|HSPA13|HSPA14|HSPA1A|HSPA1B|HSPA1L|HSPA2|HSPA4|HSPA4L|HSPA5|HSPA6|HSPA7|HSPA8|HSPA9|HSPD1|HSPH1|HYOU1|TCP1|TOR1A|TRAP1|
protein-folding chaperone binding GO:0051087 133 133 27 0 0 0 0 0.928 NA AHSA1|AHSA2P|ALB|AMFR|ATP1A1|ATP1A2|ATP1A3|ATP7A|BAG1|BAG2|BAG3|BAG4|BAG5|BAG6|BAK1|BAX|BIN1|BIRC2|BIRC5|CALR|CDC25A|CDC37|CDC37L1|CDK1|CDKN1B|CFTR|CLU|CP|CTSC|CYP1A1|CYP2E1|DNAAF6|DNAJA1|DNAJA2|DNAJA3|DNAJA4|DNAJB1|DNAJB12|DNAJB13|DNAJB14|DNAJB2|DNAJB3|DNAJB4|DNAJB5|DNAJB6|DNAJB7|DNAJB8|DNAJB9|DNAJC1|DNAJC10|DNAJC18|DNAJC2|DNAJC3|DNAJC8|DNAJC9|DNLZ|ERN1|ERP29|FGB|FGF1|FICD|FNIP1|FNIP2|GAK|GET4|GNB5|GPR37|GRN|GRPEL1|GRPEL2|HDAC8|HES1|HIKESHI|HLA-B|HSCB|HSPA2|HSPA5|HSPA8|HSPB6|HSPD1|HSPE1|IQCG|LRP2|MAPT|METTL21A|MVD|NOD2|NUP62|PACRG|PDPN|PFDN4|PFDN6|PGLYRP1|PLG|PPEF2|PPID|PRKN|PRNP|PTGES3|PTGES3L|RNF207|RPS3|SACS|SCARB2|SDF2L1|SGTB|SLC12A2|SLC25A17|SNCA|SOD1|SPN|ST13|STAU2|STUB1|SUGT1|SYVN1|TBCA|TBCC|TBCD|TBCE|TERT|TFRC|TIMM10|TIMM44|TIMM9|TP53|TSACC|TSC1|TTC4|UBL4A|USP13|VWF|WRAP53|
protein refolding GO:0042026 25 25 17 0 0 0 0 0.591 NA B2M|CRYAA|CRYAB|DNAJA1|DNAJA2|DNAJA4|DNAJB2|FKBP1A|FKBP1B|HSP90AA1|HSPA13|HSPA14|HSPA1A|HSPA1B|HSPA1L|HSPA2|HSPA5|HSPA6|HSPA7|HSPA8|HSPA9|HSPB1|HSPB2|HSPB6|HSPD1|
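The GeneMembers column stores the genes of each set as a single string separated by "|". To see which members of a set are also in your hitlist, the string can be split and intersected with the hitlist. This sketch uses a short excerpt of the protein refolding row and of the hitlist above; splitMembers is a hypothetical helper, not an ermineR function.

```r
# Hypothetical helper (not part of ermineR): turn a GeneMembers string such as
# "A|B|C|" into a character vector. strsplit() drops the trailing empty field.
splitMembers <- function(geneMembers) {
  strsplit(geneMembers, "|", fixed = TRUE)[[1]]
}

# Excerpts of the output and hitlist above, shortened for readability.
membersExcerpt <- "B2M|CRYAA|CRYAB|DNAJA1|"
hitlistExcerpt <- c("AAMP", "CRYAA", "CRYAB", "DNAJA1")

members <- splitMembers(membersExcerpt)
hitsInSet <- intersect(members, hitlistExcerpt)
hitsInSet  # "CRYAA" "CRYAB" "DNAJA1"
```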

Using your own GO annotations

If you want to use your own GO annotations instead of getting files provided by Pavlidis Lab, you can use makeAnnotation after turning your annotations into a list. See the example below.

library('org.Hs.eg.db') # get GO terms from Bioconductor
goAnnots = as.list(org.Hs.egGO)
goAnnots = goAnnots %>% lapply(names)
goAnnots %>% head
## $`1`
##  [1] "GO:0008150" "GO:0005576" "GO:0005576" "GO:0005576" "GO:0005615"
##  [6] "GO:0005886" "GO:0031093" "GO:0034774" "GO:0062023" "GO:0070062"
## [11] "GO:0072562" "GO:1904813" "GO:0003674"
## 
## $`2`
##  [1] "GO:0001553" "GO:0001869" "GO:0002438" "GO:0006953" "GO:0007584"
##  [6] "GO:0010037" "GO:0034695" "GO:0048863" "GO:0051384" "GO:1990402"
## [11] "GO:0005576" "GO:0005576" "GO:0005615" "GO:0031093" "GO:0062023"
## [16] "GO:0070062" "GO:0072562" "GO:0002020" "GO:0002020" "GO:0004866"
## [21] "GO:0004866" "GO:0004867" "GO:0005102" "GO:0005515" "GO:0019838"
## [26] "GO:0019899" "GO:0019959" "GO:0019966" "GO:0042802" "GO:0043120"
## [31] "GO:0048306" "GO:0048403" "GO:0048406"
## 
## $`3`
## NULL
## 
## $`9`
## [1] "GO:0006805" "GO:0005829" "GO:0004060" "GO:0004060"
## 
## $`10`
## [1] "GO:0006805" "GO:0005829" "GO:0004060" "GO:0004060" "GO:0005515"
## 
## $`11`
## NULL

The goAnnots object we created holds GO terms per Entrez ID. Similar lists can be obtained from other species db packages in Bioconductor and from some array annotation packages. We will now use the makeAnnotation function to create our annotation file. This file will use the names of this list (Entrez IDs) as gene identifiers, so any score or hitlist file you provide should use Entrez IDs as well.
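Since identifiers that do not match the annotation are ignored, it can help to check a hitlist against the names of the annotation list before running an analysis. The sketch below uses a toy list in the same shape as goAnnots, with IDs and terms copied from the output above; the hitlist is made up for illustration.

```r
# Toy list in the same shape as goAnnots: names are Entrez IDs, values are
# GO term vectors (copied from the output above, duplicates removed).
toyAnnots <- list(
  "9"  = c("GO:0006805", "GO:0005829", "GO:0004060"),
  "10" = c("GO:0006805", "GO:0005829", "GO:0004060", "GO:0005515"),
  "11" = NULL
)

# Made-up hitlist: "NAT2" is a gene symbol rather than an Entrez ID.
myHitlist <- c("9", "10", "NAT2")

# Identifiers that are absent from the annotation would not be matched.
missingIds <- setdiff(myHitlist, names(toyAnnots))
missingIds  # "NAT2"
```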

makeAnnotation only needs the list of gene identifiers and GO terms to work, but if you want a complete annotation file you can also provide gene symbols and gene names. Gene names have no effect on the analysis. Gene symbols matter if you are providing custom gene sets and using “Option 2”, or if the same genes are represented by multiple gene identifiers (e.g. probes). Gene symbols will also be returned in the GeneMembers column of the output. If they are not provided, gene IDs will be used as gene symbols.

Here we’ll set them both for good measure.

geneSymbols = as.list(org.Hs.egSYMBOL) %>% unlist
geneName = as.list(org.Hs.egGENENAME) %>% unlist

annotation = makeAnnotation(goAnnots,
                            symbol = geneSymbols,
                            name = geneName,
                            output = NULL, # you can choose to save the annotation to a file
                            return = TRUE) # if you only want to save it to a file, you don't need to return

Now that we have the annotation object, we can use it to run an analysis. We’ll generate a hitlist composed only of genes annotated with GO:0051082.

mockHitlist = goAnnots %>% sapply(function(x){'GO:0051082' %in% x}) %>% 
    {goAnnots[.]} %>% 
    names

mockHitlist %>% head
## [1] "325"  "811"  "821"  "871"  "908"  "1047"
oraOut = ora(annotation = annotation,
             hitlist = mockHitlist)

head(oraOut$results) %>% knitr::kable()
Name ID NumProbes NumGenes RawScore Pval CorrectedPvalue MFPvalue CorrectedMFPvalue Multifunctionality Same as GeneMembers
unfolded protein binding GO:0051082 122 122 122.000 0E00 0E00 1.226E-306 5.253E-303 0.695 NA AFG3L2|AHSP|AIP|AIPL1|APCS|CALR|CALR3|CANX|CCAR2|CCDC115|CCT2|CCT3|CCT4|CCT5|CCT6A|CCT6B|CCT7|CCT8|CCT8L1P|CCT8L2|CDC37|CDC37L1|CHAF1A|CHAF1B|CLGN|CLPX|CLU|CRYAA|CRYAB|DNAJA1|DNAJA2|DNAJA3|DNAJA4|DNAJB1|DNAJB11|DNAJB13|DNAJB2|DNAJB3|DNAJB4|DNAJB5|DNAJB6|DNAJB7|DNAJB8|DNAJC4|ERLEC1|ERN1|ERN2|ERO1B|GRPEL1|GRPEL2|HEATR3|HSP90AA1|HSP90AA2P|HSP90AA4P|HSP90AA5P|HSP90AB1|HSP90AB2P|HSP90AB3P|HSP90AB4P|HSP90B1|HSP90B2P|HSPA1A|HSPA1B|HSPA1L|HSPA2|HSPA5|HSPA6|HSPA8|HSPA9|HSPB1|HSPB2|HSPB6|HSPD1|HSPE1|HTRA2|HYOU1|LMAN1|MKKS|NACA|NACA2|NACA4P|NACAD|NAP1L4|NDUFAF1|NPM1|NUDC|NUDCD2|NUDCD3|PDRG1|PET100|PFDN1|PFDN2|PFDN4|PFDN5|PFDN6|PPIA|PPIB|PTGES3|RP2|RUVBL2|SCAP|SCG5|SERPINH1|SHQ1|SIL1|SPG7|SRSF10|SRSF12|SSUH2|SYVN1|TAPBP|TCP1|TIMM10B|TMEM67|TOMM20|TOR1A|TRAP1|TTC1|TUBB4B|UGGT1|UGGT2|VBP1|
protein folding chaperone GO:0044183 69 69 47.000 2.823E-92 6.049E-89 1.565E-89 3.353E-86 0.573 NA ANP32E|APLF|CALR|CALR3|CCDC47|CCT2|CCT3|CCT4|CCT5|CCT6A|CCT6B|CCT7|CCT8|CCT8L1P|CCT8L2|CD74|CDC123|CLGN|CLPX|DFFA|DNAJB1|DNAJB6|DNAJB7|DNAJB8|FKBP8|HSP90AA1|HSP90AA2P|HSP90AA4P|HSP90AA5P|HSP90AB1|HSP90AB2P|HSP90AB3P|HSP90AB4P|HSP90B1|HSP90B2P|HSPA13|HSPA14|HSPA1A|HSPA1B|HSPA1L|HSPA2|HSPA4|HSPA4L|HSPA5|HSPA6|HSPA7|HSPA8|HSPA9|HSPB1|HSPB6|HSPD1|HSPE1|HSPH1|HYOU1|HYPK|KHSRP|LYRM7|PDCL3|PFDN1|PFDN2|RIC3|TAPBP|TCP1|TOR1A|TRAP1|WDR83OS|WIPF1|ZMYND10|ZPR1|
ATP-dependent protein folding chaperone GO:0140662 40 40 34.000 3.365E-72 4.807E-69 3.79E-69 5.413E-66 0.541 NA CCT2|CCT3|CCT4|CCT5|CCT6A|CCT6B|CCT7|CCT8|CCT8L1P|CCT8L2|CLPX|HSP90AA1|HSP90AA2P|HSP90AA4P|HSP90AA5P|HSP90AB1|HSP90AB2P|HSP90AB3P|HSP90AB4P|HSP90B1|HSP90B2P|HSPA13|HSPA14|HSPA1A|HSPA1B|HSPA1L|HSPA2|HSPA4|HSPA4L|HSPA5|HSPA6|HSPA7|HSPA8|HSPA9|HSPD1|HSPH1|HYOU1|TCP1|TOR1A|TRAP1|
chaperone-mediated protein folding GO:0061077 74 74 39.000 1.513E-69 1.621E-66 4.022E-67 4.308E-64 0.599 NA BAG1|CCT2|CCT3|CCT4|CCT5|CCT6A|CCT7|CCT8|CD74|CHORDC1|CLU|CRTAP|CSNK2A1|DFFA|DNAJB1|DNAJB12|DNAJB13|DNAJB14|DNAJB2|DNAJB3|DNAJB4|DNAJB5|DNAJB6|DNAJB7|DNAJB8|DNAJC18|DNAJC24|DNAJC5|DNAJC7|ERO1A|FKBP11|FKBP2|FKBP4|FKBP5|GAK|HSPA13|HSPA14|HSPA1A|HSPA1B|HSPA1L|HSPA2|HSPA5|HSPA6|HSPA7|HSPA8|HSPA9|HSPB1|HSPB6|HSPE1|HSPH1|P3H1|PDCL3|PDIA4|PEX19|PFDN1|PFDN2|PFDN4|PFDN5|PFDN6|PPIB|PPID|PTGES3|SDF2|SDF2L1|ST13|TCP1|TOR1A|TOR1B|TOR2A|TRAP1|UMOD|UNC45A|UNC45B|VBP1|
protein-folding chaperone binding GO:0051087 133 133 30.000 8.981E-40 7.697E-37 5.724E-38 4.906E-35 0.928 NA AHSA1|AHSA2P|ALB|AMFR|ATP1A1|ATP1A2|ATP1A3|ATP7A|BAG1|BAG2|BAG3|BAG4|BAG5|BAG6|BAK1|BAX|BIN1|BIRC2|BIRC5|CALR|CDC25A|CDC37|CDC37L1|CDK1|CDKN1B|CFTR|CLU|CP|CTSC|CYP1A1|CYP2E1|DNAAF6|DNAJA1|DNAJA2|DNAJA3|DNAJA4|DNAJB1|DNAJB12|DNAJB13|DNAJB14|DNAJB2|DNAJB3|DNAJB4|DNAJB5|DNAJB6|DNAJB7|DNAJB8|DNAJB9|DNAJC1|DNAJC10|DNAJC18|DNAJC2|DNAJC3|DNAJC8|DNAJC9|DNLZ|ERN1|ERP29|FGB|FGF1|FICD|FNIP1|FNIP2|GAK|GET4|GNB5|GPR37|GRN|GRPEL1|GRPEL2|HDAC8|HES1|HIKESHI|HLA-B|HSCB|HSPA2|HSPA5|HSPA8|HSPB6|HSPD1|HSPE1|IQCG|LRP2|MAPT|METTL21A|MVD|NOD2|NUP62|PACRG|PDPN|PFDN4|PFDN6|PGLYRP1|PLG|PPEF2|PPID|PRKN|PRNP|PTGES3|PTGES3L|RNF207|RPS3|SACS|SCARB2|SDF2L1|SGTB|SLC12A2|SLC25A17|SNCA|SOD1|SPN|ST13|STAU2|STUB1|SUGT1|SYVN1|TBCA|TBCC|TBCD|TBCE|TERT|TFRC|TIMM10|TIMM44|TIMM9|TP53|TSACC|TSC1|TTC4|UBL4A|USP13|VWF|WRAP53|
protein refolding GO:0042026 25 25 19.000 1.628E-38 1.163E-35 8.77E-36 6.263E-33 0.574 NA B2M|CRYAA|CRYAB|DNAJA1|DNAJA2|DNAJA4|DNAJB2|FKBP1A|FKBP1B|HSP90AA1|HSPA13|HSPA14|HSPA1A|HSPA1B|HSPA1L|HSPA2|HSPA5|HSPA6|HSPA7|HSPA8|HSPA9|HSPB1|HSPB2|HSPB6|HSPD1|

We can see that GO:0051082 is the top-scoring hit, as expected.
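The hitlist in this example was built with a pipe and an anonymous code block. For readers who prefer base R, the same selection can be sketched with vapply. The toy list below stands in for goAnnots; its contents are abbreviated for illustration and are not a full set of annotations.

```r
# Toy stand-in for goAnnots: names are Entrez IDs, values are GO term vectors.
toyAnnots <- list(
  "325" = c("GO:0051082"),
  "2"   = c("GO:0001553", "GO:0005576"),
  "3"   = NULL
)

# TRUE for every gene whose annotations include the term of interest.
# `%in%` returns FALSE for NULL entries, so unannotated genes are dropped.
hasTerm <- vapply(toyAnnots, function(x) "GO:0051082" %in% x, logical(1))
toyHitlist <- names(toyAnnots)[hasTerm]
toyHitlist  # "325"
```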