The Gene Ontology (GO)
+ (
For example, in the GO data,
Gene association files (GAF) are text files that annotate an organism’s gene products with Gene Ontology terms, associating functions with gene products. For example, a GAF file connects a gene product label, such as
The
+
The most annotated gene in the human genome,
+
Web-based tools designed to visualize and filter gene ontology data
+ include
GeneScape is a Python package that allows users to visualize a list of genes in the functional context represented by the Gene Ontology
GeneScape is distributed both as a command-line tool and as GUI-enabled standalone software via the
+
GeneScape as a Shiny App
+
GeneScape is distributed with several prebuilt databases for model organisms, including the human, mouse, rat, fruit fly, and zebrafish genomes. To study additional organisms, users must download GAF files from the Gene Ontology website and create custom databases using the
+
For detailed instructions on using the software, users should refer to the
+
A typical usage starts with a gene list such as:
Users can process the list above via the command line or the Shiny interface. A command line invocation might look like:
The command above will produce the image:
Ontology subgraph for a gene list
+
Internally, GeneScape first transforms the input gene list into a GO term list, where additional information is added to each term:
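The gene-to-term transformation can be illustrated with a minimal sketch. It is not GeneScape's actual implementation: it assumes annotations arrive as tab-separated GAF rows (17 columns, comment lines starting with `!`), and the helper names `make_row` and `genes_to_terms`, the gene symbols, and the term identifiers are all hypothetical placeholders.

```python
from collections import defaultdict


def make_row(symbol, go_id, aspect):
    """Build a toy 17-column GAF-style row (placeholder values only)."""
    cols = [""] * 17
    cols[0] = "DB"             # column 1: source database
    cols[1] = symbol + "_ID"   # column 2: object identifier
    cols[2] = symbol           # column 3: gene product symbol
    cols[4] = go_id            # column 5: GO term identifier
    cols[8] = aspect           # column 9: aspect (P, F, or C)
    return "\t".join(cols)


# Illustrative rows; the symbols and term IDs are made up.
GAF_LINES = [
    "!gaf-version: 2.2",
    make_row("GENE1", "GO:9000001", "P"),
    make_row("GENE2", "GO:9000001", "P"),
    make_row("GENE2", "GO:9000002", "F"),
    make_row("GENE3", "GO:9000003", "C"),
]


def genes_to_terms(gaf_lines, genes):
    """Map each GO term to its aspect and the input genes annotated with it."""
    wanted = set(genes)
    terms = defaultdict(lambda: {"aspect": None, "genes": set()})
    for line in gaf_lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        symbol, go_id, aspect = cols[2], cols[4], cols[8]
        if symbol in wanted:
            terms[go_id]["aspect"] = aspect
            terms[go_id]["genes"].add(symbol)
    return dict(terms)


terms = genes_to_terms(GAF_LINES, ["GENE1", "GENE2"])
for go_id, info in sorted(terms.items()):
    print(go_id, info["aspect"], len(info["genes"]))
```

Keeping the set of genes per term, rather than only a count, makes it straightforward to derive the per-term gene counts later used for labels and filtering.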
In the next step, GeneScape draws the GO terms as a graph structure using the NetworkX package (
Various colors and labels are used to provide additional context to the nodes in the graph; for example, functions present in the input genes are colored green. Intermediate nodes are colored by their category. Node labels display the total annotations and the number of genes that carry that function.
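The coloring rule described above can be sketched with NetworkX as follows. This is an illustration under stated assumptions, not GeneScape's code: the term identifiers, the per-category palette in `CATEGORY_COLORS`, the label format, and the function name `build_colored_graph` are all hypothetical. Only the rule that terms annotated on the input genes are green comes from the text.

```python
import networkx as nx

# Hypothetical parent -> child relations between placeholder terms.
EDGES = [
    ("T:root_bp", "T:metabolism"),
    ("T:metabolism", "T:lipid_metabolism"),
    ("T:root_mf", "T:binding"),
]

# Category (namespace) of each placeholder term.
NAMESPACE = {
    "T:root_bp": "BP",
    "T:metabolism": "BP",
    "T:lipid_metabolism": "BP",
    "T:root_mf": "MF",
    "T:binding": "MF",
}

# Terms annotated directly on the input genes, with gene counts.
INPUT_TERMS = {"T:lipid_metabolism": 2, "T:binding": 1}

# Assumed palette for intermediate nodes; the real colors may differ.
CATEGORY_COLORS = {"BP": "lightblue", "MF": "lightyellow", "CC": "pink"}


def build_colored_graph(edges, namespace, input_terms):
    """Build a directed graph and attach color and label attributes."""
    graph = nx.DiGraph()
    graph.add_edges_from(edges)
    for node in graph.nodes:
        if node in input_terms:
            color = "green"  # function present in the input genes
        else:
            color = CATEGORY_COLORS[namespace[node]]  # intermediate node
        graph.nodes[node]["color"] = color
        graph.nodes[node]["label"] = f"{node} ({input_terms.get(node, 0)})"
    return graph


graph = build_colored_graph(EDGES, NAMESPACE, INPUT_TERMS)
for node, attrs in graph.nodes(data=True):
    print(node, attrs["color"], attrs["label"])
```

Storing color and label as node attributes keeps the drawing step independent of how the attributes were computed.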
Filtering a large graph for a specific term

In the web interface, users can zoom in and out of the tree. The software’s command-line version supports generating outputs in various formats, such as PDF or PNG.
Since the resulting graphs may also be large, with thousands of nodes, the main interface provides input widgets that allow users to interactively reduce the subgraph to nodes for which:
- The function definitions match certain patterns.
- A minimum number of genes share a function.
- Nodes belong to a specific GO subtree: Biological Process (BP), Molecular Function (MF), or Cellular Component (CC).
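The three criteria above can be combined into a single predicate, sketched below. This is an assumption-laden illustration rather than GeneScape's filtering code: the term records in `TERMS`, the function name `filter_terms`, and its parameters are hypothetical, and the pattern is matched against a short term name instead of a full definition.

```python
import re

# Placeholder term records: name, GO subtree, and number of input genes.
TERMS = {
    "T:1": {"name": "lipid metabolic process", "namespace": "BP", "genes": 3},
    "T:2": {"name": "lipid binding", "namespace": "MF", "genes": 1},
    "T:3": {"name": "membrane", "namespace": "CC", "genes": 4},
    "T:4": {"name": "protein binding", "namespace": "MF", "genes": 2},
}


def filter_terms(terms, pattern=None, min_genes=1, namespace=None):
    """Return IDs of terms that satisfy every active criterion."""
    regex = re.compile(pattern, re.IGNORECASE) if pattern else None
    kept = []
    for term_id, info in terms.items():
        if regex and not regex.search(info["name"]):
            continue  # name does not match the pattern
        if info["genes"] < min_genes:
            continue  # too few genes share this function
        if namespace and info["namespace"] != namespace:
            continue  # term lies in a different GO subtree
        kept.append(term_id)
    return kept


print(filter_terms(TERMS, pattern="lipid"))
print(filter_terms(TERMS, pattern="lipid", min_genes=2))
print(filter_terms(TERMS, namespace="MF"))
```

The returned identifiers could then be used to select the corresponding nodes from the full graph before drawing.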
As an example, take the input gene list of just four genes:
the resulting functional ontology graph is large with 641 nodes and 1,007 edges:
Very few genes can produce a large ontology tree
+
Users can reduce the tree to show only terms that match the word
+
The filtering process will result in a smaller tree with 18 nodes and 29 edges, focused on the functions that contain the word “lipid”:
Filtering a large graph for a specific term
+
The software’s primary purpose is to allow users to assess the functional depth of genes and identify commonalities and differences in the functional context of these genes.
We acknowledge support from the Huck Institutes for the Life Sciences at the Pennsylvania State University.
+