diff --git a/joss.06624/10.21105.joss.06624.crossref.xml b/joss.06624/10.21105.joss.06624.crossref.xml new file mode 100644 index 0000000000..e66165ab47 --- /dev/null +++ b/joss.06624/10.21105.joss.06624.crossref.xml @@ -0,0 +1,208 @@ + + + + 20240605114058-b4e6886f607cd1e62f17b7233a0608305e5ea561 + 20240605114058 + + JOSS Admin + admin@theoj.org + + The Open Journal + + + + + Journal of Open Source Software + JOSS + 2475-9066 + + 10.21105/joss + https://joss.theoj.org + + + + + 06 + 2024 + + + 9 + + 98 + + + + GeneScape: A Python package for gene ontology +visualization + + + + Istvan + Albert + https://orcid.org/0000-0001-8366-984X + + + + 06 + 05 + 2024 + + + 6624 + + + 10.21105/joss.06624 + + + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + http://creativecommons.org/licenses/by/4.0/ + + + + Software archive + 10.5281/zenodo.11245264 + + + GitHub review issue + https://github.com/openjournals/joss-reviews/issues/6624 + + + + 10.21105/joss.06624 + https://joss.theoj.org/papers/10.21105/joss.06624 + + + https://joss.theoj.org/papers/10.21105/joss.06624.pdf + + + + + + Gene ontology: Tool for the unification of +biology + Ashburner + Nature Genetics + 1 + 25 + 10.1038/75556 + 2000 + Ashburner, M., Ball, C. A., Blake, J. +A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., +Dwight, S. S., Eppig, J. T., Harris, M. A., Hill, D. P., Issel-Tarver, +L., Kasarskis, A., Lewis, S., Matese, J. C., Richardson, J. E., +Ringwald, M., Rubin, G. M., & Sherlock, G. (2000). Gene ontology: +Tool for the unification of biology. Nature Genetics, 25(1), 25–29. +https://doi.org/10.1038/75556 + + + The Gene Ontology knowledgebase in +2023 + Consortium + Genetics + 1 + 224 + 10.1093/genetics/iyad031 + 1943-2631 + 2023 + Consortium, T. G. O., Aleksander, S. +A., Balhoff, J., Carbon, S., Cherry, J. M., Drabkin, H. J., Ebert, D., +Feuermann, M., Gaudet, P., Harris, N. L., Hill, D. P., Lee, R., Mi, H., +Moxon, S., Mungall, C. 
J., Muruganugan, A., Mushayahama, T., Sternberg, +P. W., Thomas, P. D., … Westerfield, M. (2023). The Gene Ontology +knowledgebase in 2023. Genetics, 224(1), iyad031. +https://doi.org/10.1093/genetics/iyad031 + + + AmiGO: online access to ontology and +annotation data + Carbon + Bioinformatics + 2 + 25 + 10.1093/bioinformatics/btn615 + 1367-4803 + 2008 + Carbon, S., Ireland, A., Mungall, C. +J., Shu, S., Marshall, B., Lewis, S., AmiGO Hub, the, & Web Presence +Working Group, the. (2008). AmiGO: online access to ontology and +annotation data. Bioinformatics, 25(2), 288–289. +https://doi.org/10.1093/bioinformatics/btn615 + + + GOATOOLS: A python library for gene ontology +analyses + Klopfenstein + Scientific reports + 1 + 8 + 10.1038/s41598-018-28948-z + 2018 + Klopfenstein, D., Zhang, L., +Pedersen, B. S., Ramı́rez, F., Warwick Vesztrocy, A., Naldi, A., Mungall, +C. J., Yunes, J. M., Botvinnik, O., Weigel, M., & others. (2018). +GOATOOLS: A python library for gene ontology analyses. Scientific +Reports, 8(1), 1–17. +https://doi.org/10.1038/s41598-018-28948-z + + + topGO: Enrichment analysis for gene +ontology + Alexa + Bioconductor + 10.18129/B9.bioc.topGO + 2023 + Alexa, A., & Rahnenfuhrer, J. +(2023). topGO: Enrichment analysis for gene ontology. In Bioconductor. +Bioconductor. +https://doi.org/10.18129/B9.bioc.topGO + + + QuickGO: a web-based tool for Gene Ontology +searching + Binns + Bioinformatics + 22 + 25 + 10.1093/bioinformatics/btp536 + 1367-4803 + 2009 + Binns, D., Dimmer, E., Huntley, R., +Barrell, D., O’Donovan, C., & Apweiler, R. (2009). QuickGO: a +web-based tool for Gene Ontology searching. Bioinformatics, 25(22), +3045–3046. +https://doi.org/10.1093/bioinformatics/btp536 + + + Exploring network structure, dynamics, and +function using NetworkX + Hagberg + Proceedings of the 7th python in science +conference + 2008 + Hagberg, A. A., Schult, D. A., & +Swart, P. J. (2008). Exploring network structure, dynamics, and function +using NetworkX. In G. 
Varoquaux, T. Vaught, & J. Millman (Eds.), +Proceedings of the 7th python in science conference (pp. +11–15). + + + Shiny: Web application framework for +r + Chang + 2024 + Chang, W., Cheng, J., Allaire, J., +Sievert, C., Schloerke, B., Xie, Y., Allen, J., McPherson, J., Dipert, +A., & Borges, B. (2024). Shiny: Web application framework for r. +https://shiny.posit.co/ + + + + + + diff --git a/joss.06624/10.21105.joss.06624.pdf b/joss.06624/10.21105.joss.06624.pdf new file mode 100644 index 0000000000..112ab9a93b Binary files /dev/null and b/joss.06624/10.21105.joss.06624.pdf differ diff --git a/joss.06624/paper.jats/10.21105.joss.06624.jats b/joss.06624/paper.jats/10.21105.joss.06624.jats new file mode 100644 index 0000000000..2733f4dfca --- /dev/null +++ b/joss.06624/paper.jats/10.21105.joss.06624.jats @@ -0,0 +1,595 @@ + + +
+ + + + +Journal of Open Source Software +JOSS + +2475-9066 + +Open Journals + + + +6624 +10.21105/joss.06624 + +GeneScape: A Python package for gene ontology +visualization + + + +https://orcid.org/0000-0001-8366-984X + +Albert +Istvan + + + + + + +Bioinformatics Consulting Center, Pennsylvania State +University, United States of America + + + + +Department of Biochemistry and Molecular Biology, +Pennsylvania State University, United States of America + + + + +30 +3 +2024 + +9 +98 +6624 + +Authors of papers retain copyright and release the +work under a Creative Commons Attribution 4.0 International License (CC +BY 4.0) +2022 +The article authors + +Authors of papers retain copyright and release the work under +a Creative Commons Attribution 4.0 International License (CC BY +4.0) + + + +Python +biology +bioinformatics +functional analysis + + + + + + Summary +

The Gene Ontology (GO) + (Ashburner + et al., 2000; + Consortium + et al., 2023) is a structured vocabulary that describes gene + products in the context of their associated functions. The ontology + takes the form of a directed graph, where each node defines a term, + and each edge represents a hierarchical relationship between the terms + (the words of the vocabulary).

+

For example, in the GO data, GO:0090630 + defines activation of GTPase activity and is a child + of GO:0043547, defined as positive + regulation of GTPase activity, which in turn is a child of + GO:0051345, representing positive + regulation of hydrolase activity.

+
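The parent-child chain described above can be sketched as a small directed graph. The snippet below is a minimal illustration, not GeneScape's implementation: the `PARENTS` mapping and the `ancestors` helper are hypothetical names, and the mapping holds only the three terms named in the text, whereas in the real ontology a term may have several parents and the chain continues further up.

```python
# Hypothetical child -> parents mapping holding only the three terms from the text.
PARENTS = {
    "GO:0090630": ["GO:0043547"],  # activation of GTPase activity
    "GO:0043547": ["GO:0051345"],  # positive regulation of GTPase activity
    "GO:0051345": [],              # positive regulation of hydrolase activity (parents omitted here)
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following child -> parent edges."""
    seen = []
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(parents.get(current, []))
    return seen


lineage = ancestors("GO:0090630")
print(lineage)
```

Walking the edges upward from GO:0090630 in this toy mapping visits GO:0043547 first and GO:0051345 second.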

Gene association files (GAF) are text files used to annotate an + organism’s gene products with Gene Ontology terms, associating + functions to gene products. For example, a GAF file connects a gene + product label, such as ZC3H11B, with multiple + GO terms, such as GO:0046872 or + GO:0016973. The complete human genome GAF + representation contains 288,575 associations of 19,606 gene symbols + with over 18,680 GO terms.

+
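To make the gene-to-term association concrete, here is a minimal sketch of reading GAF rows into a gene symbol to GO term mapping. It relies on the tab-separated GAF layout, in which comment lines start with `!`, the third column holds the gene product symbol, and the fifth column holds the GO ID. The `read_gaf` and `fake_row` helpers are hypothetical, and the sample rows are made up except for the symbol and GO IDs quoted in the text; this is not how GeneScape itself parses the files.

```python
from collections import defaultdict


def read_gaf(lines):
    """Map each gene symbol to the set of GO terms annotated to it."""
    gene2go = defaultdict(set)
    for line in lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment lines and blank lines
        cols = line.rstrip("\n").split("\t")
        symbol, go_id = cols[2], cols[4]  # GAF columns 3 (symbol) and 5 (GO ID)
        gene2go[symbol].add(go_id)
    return dict(gene2go)


def fake_row(symbol, go_id):
    """Build an illustrative 17-column GAF row with all other fields left empty."""
    cols = [""] * 17
    cols[2], cols[4] = symbol, go_id
    return "\t".join(cols) + "\n"


sample = [
    "!gaf-version: 2.2\n",
    fake_row("ZC3H11B", "GO:0046872"),
    fake_row("ZC3H11B", "GO:0016973"),
]
gene2go = read_gaf(sample)
print(sorted(gene2go["ZC3H11B"]))
```

For a real compressed file, the same function could be fed from `gzip.open(path, "rt")`, since it only needs an iterable of text lines.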

The + Gene + Ontology Consortium maintains GAF files for various + organisms. Typical genomic analysis protocols generate gene lists that + must be placed in a functional context.

+
+ + Statement of need +

The most annotated gene in the human genome, + HTT, currently has 1100 annotations. Thus, even + small lists of genes may have a large number of annotations, presenting + an extraordinary challenge for interpretation. There is a clear need + to visualize shared gene functions in an informative manner.

+

Web-based tools designed to visualize and filter gene ontology data + include AmiGO + (Carbon + et al., 2008) and QuickGO + (Binns + et al., 2009). Command line tools like + goatools + (Klopfenstein + et al., 2018) support GO term lineage visualization. R packages + like topGO + (Alexa + & Rahnenfuhrer, 2023) implement GO structure visualizations + of enriched GO terms. We are unaware of locally installable software + that allows for interactive filtering and visualization of gene + ontology terms derived from gene lists.

+

GeneScape is a Python package that allows users to visualize a list + of genes in the functional context represented by the Gene + Ontology.

+

GeneScape is distributed both as a command-line tool and as + GUI-enabled standalone software via the + Shiny + platform + (Chang + et al., 2024), making it accessible to a wide range of + users.

+ +

GeneScape as a Shiny App +

+ +
+

GeneScape is distributed with several prebuilt databases for model + organisms including the human, mouse, rat, fruit fly and zebrafish + genomes. To study additional organisms, users must download GAF files + from the Gene Ontology website and create custom databases using the + build subcommand:

+ genescape build --gaf mydata.gaf.gz --index mydata.index.gz +

For detailed instructions on using the software, users should refer + to the + GeneScape + documentation. A Q&A discussion board is also available + on the GeneScape GitHub page.

+ + Typical usage +

A typical usage starts with a gene list such as:

+ ABTB3 +BCAS4 +C3P1 +GRTP1 +

Users can process the list above via the command line or the + Shiny interface. A command line invocation might look like:

+ genescape tree genes1.txt -o output.pdf +

The command above will produce the image:

+ +

Ontology subgraph for a gene list +

+ +
+

Internally, GeneScape first transforms the input gene list into a + GO term list, where additional information is added to each + term:

+ Coverage,Function,Domain,GO,Genes +1,endopeptidase inhibitor activity,MF,GO:0004866,C3P1 +1,GTPase activator activity,MF,GO:0005096,GRTP1 +1,extracellular space,CC,GO:0005615,C3P1 +1,cytoplasm,CC,GO:0005737,BCAS4 +1,membrane,CC,GO:0016020,ABTB3 +1,PDZ domain binding,MF,GO:0030165,ABTB3 +1,BLOC-1 complex,CC,GO:0031083,BCAS4 +1,"synaptic transmission, glutamatergic",BP,GO:0035249,ABTB3 +1,exploration behavior,BP,GO:0035640,ABTB3 +1,protein heterodimerization activity,MF,GO:0046982,ABTB3 +1,protein stabilization,BP,GO:0050821,ABTB3 +1,activation of GTPase activity,BP,GO:0090630,GRTP1 +1,glutamatergic synapse,CC,GO:0098978,ABTB3 +
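The transformation from a gene list to a term table can be sketched as follows. This is an illustration under stated assumptions rather than GeneScape's code: `GENE2GO`, `TERMS`, and `term_table` are hypothetical names, the data is a four-row subset of the table above, coverage is assumed to be the number of input genes annotated with a term, and joining multiple gene names with `|` is an arbitrary choice made for this sketch.

```python
import csv
import io
from collections import defaultdict

# Hypothetical annotations for two genes, taken from the table above.
GENE2GO = {
    "GRTP1": {"GO:0005096", "GO:0090630"},
    "C3P1": {"GO:0004866", "GO:0005615"},
}

# Hypothetical term metadata: GO ID -> (function name, domain).
TERMS = {
    "GO:0004866": ("endopeptidase inhibitor activity", "MF"),
    "GO:0005096": ("GTPase activator activity", "MF"),
    "GO:0005615": ("extracellular space", "CC"),
    "GO:0090630": ("activation of GTPase activity", "BP"),
}


def term_table(genes, gene2go=GENE2GO, terms=TERMS):
    """Invert gene -> terms into one row per term, sorted by GO ID."""
    go2genes = defaultdict(set)
    for gene in genes:
        for go_id in gene2go.get(gene, ()):
            go2genes[go_id].add(gene)
    rows = []
    for go_id in sorted(go2genes):
        name, domain = terms[go_id]
        carriers = sorted(go2genes[go_id])
        # Coverage is assumed here to mean the number of input genes with this term.
        rows.append([len(carriers), name, domain, go_id, "|".join(carriers)])
    return rows


rows = term_table(["C3P1", "GRTP1"])
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["Coverage", "Function", "Domain", "GO", "Genes"])
writer.writerows(rows)
print(out.getvalue())
```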

In the next step, GeneScape draws the GO terms as a graph + structure using the NetworkX package + (Hagberg + et al., 2008), helping users visualize the functional context + of the genes relative to the larger Gene Ontology.

+

Various colors and labels are used to provide additional context + to the nodes in the graph; for example, functions present in the + input genes are colored green. Intermediate nodes are colored by + their category. Node labels display the total annotations and the + number of genes that carry that function.

+ +

Filtering a large graph for a specific term +

+ +
+

In the web interface, users can zoom in and out of the tree. The + software’s command-line version supports generating outputs in + various formats, such as PDF or PNG.

+

Since the resulting graphs may also be large, with thousands of + nodes, the main interface provides input widgets that allow users to + interactively reduce the subgraph to nodes for which:

+ + +

The function definitions match certain patterns.

+
+ +

A minimum number of genes share a function.

+
+ +

Nodes belong to a specific GO subtree: Biological Process + (BP), Molecular Function (MF), Cellular Component (CC).

+
+
+
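The three criteria above can be combined into a single filter. The sketch below is illustrative only: `ROWS` and `filter_rows` are hypothetical, the coverage values are invented, and treating the pattern as a case-insensitive regular expression is an assumption rather than a statement about how GeneScape matches terms.

```python
import re

# Hypothetical rows in (coverage, function, domain, GO ID) order; coverage values are invented.
ROWS = [
    (2, "lipid metabolic process", "BP", "GO:0006629"),
    (1, "lipid binding", "MF", "GO:0008289"),
    (3, "membrane", "CC", "GO:0016020"),
]


def filter_rows(rows, pattern=None, mincov=1, domain=None):
    """Keep rows that satisfy every criterion that was supplied."""
    regex = re.compile(pattern, re.IGNORECASE) if pattern else None
    keep = []
    for cov, name, dom, go_id in rows:
        if regex and not regex.search(name):
            continue  # function definition does not match the pattern
        if cov < mincov:
            continue  # too few genes share the function
        if domain and dom != domain:
            continue  # term lies outside the requested subtree (BP, MF or CC)
        keep.append((cov, name, dom, go_id))
    return keep


selected = filter_rows(ROWS, pattern="lipid", mincov=2)
print(selected)
```

Each criterion is optional, so calling the function with no arguments beyond the rows returns them unchanged.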

As an example, take the input gene list of just four genes:

+ Cyp1a1 +Sphk2 +Sptlc2 +Smpd3 +

The resulting functional ontology graph is large, with 641 nodes + and 1,007 edges:

+ +

Very few genes can produce a large ontology tree +

+ +
+

Users can reduce the tree to show only terms that match the word + lipid and are supported by at least two genes, + either via the graphical user interface or the command + line:

+ genescape tree -m lipid --mincov 2 genes2.txt -o output.pdf +

The filtering process will result in a smaller tree with 18 nodes + and 29 edges, focused on the functions that contain the word + “lipid”:

+ +

Filtering a large graph for a specific term +

+ +
+

The software’s primary purpose is to allow users to assess the + functional depth of genes and identify commonalities and differences + in the functional context of these genes.

+
+
+ + Acknowledgments +

We acknowledge support from the Huck Institutes for the Life + Sciences at the Pennsylvania State University.

+
+ + + + + + + + AshburnerMichael + BallCatherine A. + BlakeJudith A. + BotsteinDavid + ButlerHeather + CherryJ. Michael + DavisAllan P. + DolinskiKara + DwightSelina S. + EppigJanan T. + HarrisMidori A. + HillDavid P. + Issel-TarverLaurie + KasarskisAndrew + LewisSuzanna + MateseJohn C. + RichardsonJoel E. + RingwaldMartin + RubinGerald M. + SherlockGavin + + Gene ontology: Tool for the unification of biology + Nature Genetics + 2000 + 25 + 1 + https://doi.org/10.1038/75556 + 10.1038/75556 + 25 + 29 + + + + + + ConsortiumThe Gene Ontology + AleksanderSuzi A + BalhoffJames + CarbonSeth + CherryJ Michael + DrabkinHarold J + EbertDustin + FeuermannMarc + GaudetPascale + HarrisNomi L + HillDavid P + LeeRaymond + MiHuaiyu + MoxonSierra + MungallChristopher J + MuruganuganAnushya + MushayahamaTremayne + SternbergPaul W + ThomasPaul D + Van AukenKimberly + RamseyJolene + SiegeleDeborah A + ChisholmRex L + FeyPetra + AspromonteMaria Cristina + NugnesMaria Victoria + QuagliaFederica + TosattoSilvio + GiglioMichelle + NadendlaSuvarna + AntonazzoGiulia + AttrillHelen + SantosGil dos + MarygoldSteven + StreletsVictor + TaboneChristopher J + ThurmondJim + ZhouPinglei + AhmedSaadullah H + AsanitthongPraoparn + Luna BuitragoDiana + ErdolMeltem N + GageMatthew C + Ali KadhumMohamed + LiKan Yan Chloe + LongMiao + MichalakAleksandra + PesalaAngeline + PritazahraArmalya + SaverimuttuShirin C C + SuRenzhi + ThurlowKate E + LoveringRuth C + LogieColin + OliferenkoSnezhana + BlakeJudith + ChristieKaren + CorbaniLori + DolanMary E + DrabkinHarold J + HillDavid P + NiLi + SitnikovDmitry + SmithCynthia + CuzickAlayne + SeagerJames + CooperLaurel + ElserJustin + JaiswalPankaj + GuptaParul + JaiswalPankaj + NaithaniSushma + Lera-RamirezManuel + RutherfordKim + WoodValerie + De PonsJeffrey L + DwinellMelinda R + HaymanG Thomas + KaldunskiMary L + KwitekAnne E + LaulederkindStanley J F + TutajMarek A + VediMahima + WangShur-Jen + D’EustachioPeter + AimoLucila + AxelsenKristian + BridgeAlan + 
Hyka-NouspikelNevila + MorgatAnne + AleksanderSuzi A + CherryJ Michael + EngelStacia R + KarraKalpana + MiyasatoStuart R + NashRobert S + SkrzypekMarek S + WengShuai + WongEdith D + BakkerErika + BerardiniTanya Z + ReiserLeonore + AuchinclossAndrea + AxelsenKristian + Argoud-PuyGhislaine + BlatterMarie-Claude + BoutetEmmanuel + BreuzaLionel + BridgeAlan + Casals-CasasCristina + CoudertElisabeth + EstreicherAnne + Livia FamigliettiMaria + FeuermannMarc + GosArnaud + Gruaz-GumowskiNadine + HuloChantal + Hyka-NouspikelNevila + JungoFlorence + Le MercierPhilippe + LieberherrDamien + MassonPatrick + MorgatAnne + PedruzziIvo + PourcelLucille + PouxSylvain + RivoireCatherine + SundaramShyamala + BatemanAlex + Bowler-BarnettEmily + Bye-A-JeeHema + DennyPaul + IgnatchenkoAlexandr + IshtiaqRizwan + LockAntonia + LussiYvonne + MagraneMichele + MartinMaria J + OrchardSandra + RaposoPedro + SperettaElena + TyagiNidhi + WarnerKate + ZaruRossana + DiehlAlexander D + LeeRaymond + ChanJuancarlos + DiamantakisStavros + RacitiDaniela + ZarowieckiMagdalena + FisherMalcolm + James-ZornChristina + PonferradaVirgilio + ZornAaron + RamachandranSridhar + RuzickaLeyla + WesterfieldMonte + + The Gene Ontology knowledgebase in 2023 + Genetics + 202303 + 224 + 1 + 1943-2631 + https://doi.org/10.1093/genetics/iyad031 + 10.1093/genetics/iyad031 + iyad031 + + + + + + + CarbonSeth + IrelandAmelia + MungallChristopher J. 
+ ShuShengQiang + MarshallBrad + LewisSuzanna + AmiGO Hub + Web Presence Working Group + + AmiGO: online access to ontology and annotation data + Bioinformatics + 200811 + 25 + 2 + 1367-4803 + https://doi.org/10.1093/bioinformatics/btn615 + 10.1093/bioinformatics/btn615 + 288 + 289 + + + + + + KlopfensteinDV + ZhangLiangsheng + PedersenBrent S + Ramı́rezFidel + Warwick VesztrocyAlex + NaldiAurélien + MungallChristopher J + YunesJeffrey M + BotvinnikOlga + WeigelMark + others + + GOATOOLS: A python library for gene ontology analyses + Scientific reports + Nature Publishing Group + 2018 + 8 + 1 + 10.1038/s41598-018-28948-z + 1 + 17 + + + + + + AlexaA + RahnenfuhrerJ + + topGO: Enrichment analysis for gene ontology + Bioconductor + Bioconductor + 2023 + https://bioconductor.org/packages/topGO + 10.18129/B9.bioc.topGO + + + + + + BinnsDavid + DimmerEmily + HuntleyRachael + BarrellDaniel + O’DonovanClaire + ApweilerRolf + + QuickGO: a web-based tool for Gene Ontology searching + Bioinformatics + 200909 + 25 + 22 + 1367-4803 + https://doi.org/10.1093/bioinformatics/btp536 + 10.1093/bioinformatics/btp536 + 3045 + 3046 + + + + + + HagbergAric A. + SchultDaniel A. + SwartPieter J. + + Exploring network structure, dynamics, and function using NetworkX + Proceedings of the 7th python in science conference + + VaroquauxGaël + VaughtTravis + MillmanJarrod + + Pasadena, CA USA + 2008 + 11 + 15 + + + + + + ChangWinston + ChengJoe + AllaireJJ + SievertCarson + SchloerkeBarret + XieYihui + AllenJeff + McPhersonJonathan + DipertAlan + BorgesBarbara + + Shiny: Web application framework for r + 2024 + https://shiny.posit.co/ + + + + +
diff --git a/joss.06624/paper.jats/gs_output_1.png b/joss.06624/paper.jats/gs_output_1.png new file mode 100644 index 0000000000..14ee22e16c Binary files /dev/null and b/joss.06624/paper.jats/gs_output_1.png differ diff --git a/joss.06624/paper.jats/gs_output_2.png b/joss.06624/paper.jats/gs_output_2.png new file mode 100644 index 0000000000..60c18d0b1c Binary files /dev/null and b/joss.06624/paper.jats/gs_output_2.png differ diff --git a/joss.06624/paper.jats/gs_output_3.png b/joss.06624/paper.jats/gs_output_3.png new file mode 100644 index 0000000000..2dbb332d8b Binary files /dev/null and b/joss.06624/paper.jats/gs_output_3.png differ diff --git a/joss.06624/paper.jats/gs_web_interface.png b/joss.06624/paper.jats/gs_web_interface.png new file mode 100644 index 0000000000..f417819616 Binary files /dev/null and b/joss.06624/paper.jats/gs_web_interface.png differ diff --git a/joss.06624/paper.jats/node_help_1.png b/joss.06624/paper.jats/node_help_1.png new file mode 100644 index 0000000000..c80461e712 Binary files /dev/null and b/joss.06624/paper.jats/node_help_1.png differ