
Concept Whitepaper

SchSascha edited this page May 16, 2018 · 4 revisions

This document contains the conceptual background and realization plans for the GePi publication.

Backend

  • Use an Elasticsearch index with the new mapping structure
  • Incorporate daily updates
    • Build the GePi pipeline
    • Run it daily

Input

  • A-Search
  • A-B-Search

Types of Input

  • Gene IDs
  • UniProt IDs?
  • Gene Names?

Processing

  • Mapping to top homology
  • Rather do mapping by gene name?
  • Make sure the algorithms do what they are supposed to! Write tests!
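A minimal sketch of what a tested mapping step could look like. It assumes the homology data is available as a plain lookup from input gene ID to the ID of its top homology group; `homology_map`, `map_to_top_homology`, and the IDs `g1`, `h1`, etc. are hypothetical placeholders, not part of the GePi code base.

```python
def map_to_top_homology(gene_ids, homology_map):
    """Map each input gene ID to its top homology ID.

    homology_map is an assumed dict {gene_id: top_homology_id}.
    Returns (mapped, unmapped): a dict of successful mappings and a list
    of input IDs without an entry, so that failures stay visible.
    """
    mapped = {}
    unmapped = []
    for gene_id in gene_ids:
        top_id = homology_map.get(gene_id)
        if top_id is None:
            unmapped.append(gene_id)
        else:
            mapped[gene_id] = top_id
    return mapped, unmapped


def test_map_to_top_homology():
    # Hypothetical IDs: g1 and g2 share the homology group h1, g9 is unknown.
    homology_map = {"g1": "h1", "g2": "h1"}
    mapped, unmapped = map_to_top_homology(["g1", "g2", "g9"], homology_map)
    assert mapped == {"g1": "h1", "g2": "h1"}
    assert unmapped == ["g9"]


test_map_to_top_homology()
```

Returning the unmapped IDs separately is one possible design choice; it makes it easy to test that no input is silently dropped.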

Frontend

Pie Chart

Sankey Chart, Simple

  • How to deal with enlarging the widget? When the widget is larger, more edges could be shown. But how many? That depends on the number of nodes and on the size of the nodes, i.e. their abundances.
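One possible answer to the "how many edges" question, sketched under simplifying assumptions: all shown edges are stacked in one column, edge thickness is proportional to abundance, and an edge is only worth showing if it is drawn at least `min_edge_px` pixels thick. The function name, the edge tuple layout, and the threshold are hypothetical.

```python
def select_edges(edges, widget_height_px, min_edge_px=2.0):
    """Pick the most abundant edges that still fit into the widget.

    edges: list of (source, target, count) tuples (assumed layout).
    Edges are added in order of decreasing count until the newly added,
    i.e. thinnest, edge would be drawn thinner than min_edge_px.
    """
    ranked = sorted(edges, key=lambda e: e[2], reverse=True)
    selected = []
    total = 0
    for edge in ranked:
        count = edge[2]
        if count <= 0:
            break
        new_total = total + count
        # Thickness of this edge if the selected edges fill the widget height.
        thickness = count * widget_height_px / new_total
        if thickness < min_edge_px:
            break
        selected.append(edge)
        total = new_total
    return selected


edges = [("a", "x", 50), ("a", "y", 30), ("a", "z", 15), ("a", "w", 5)]
print(len(select_edges(edges, 100, min_edge_px=10)))  # 3
print(len(select_edges(edges, 200, min_edge_px=10)))  # 4
```

With this rule the number of shown edges grows automatically with the widget height and shrinks when a few very abundant edges dominate.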

Sankey Chart, Common Partners

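  • Same question as for the simple Sankey chart
  • How to rank common partner pairs, i.e. which pairs to show?
  • Currently: a and b have common partner c -> score(a,b,c) = #(a,c) + #(b,c)
    • downside is that unbalanced hits are not ranked lower: weighting the score with a balance factor in (0,1] could be an option, e.g. min(#(a,c) / #(b,c), #(b,c) / #(a,c)). The max of the two ratios is always >= 1 and therefore cannot serve as such a weight. A factor of 2 is only needed for the alternative form 2 * min(#(a,c), #(b,c)) / (#(a,c) + #(b,c)), which also lies in (0,1].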
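The current score and one possible balance weighting can be sketched as follows. The sketch assumes the weight is the smaller of the two ratios #(a,c) / #(b,c) and #(b,c) / #(a,c), because that variant lies in (0,1]; the function names are hypothetical and the weighting is an option under discussion, not the implemented ranking.

```python
def pair_score(n_ac, n_bc):
    """Current score: number of (a,c) hits plus number of (b,c) hits."""
    return n_ac + n_bc


def balance(n_ac, n_bc):
    """Balance factor in (0, 1]; 1 means both sides are equally abundant.

    Equals min(n_ac / n_bc, n_bc / n_ac). Returns 0.0 if either count is
    not positive, since c is then not a common partner at all.
    """
    if n_ac <= 0 or n_bc <= 0:
        return 0.0
    return min(n_ac, n_bc) / max(n_ac, n_bc)


def weighted_score(n_ac, n_bc):
    """Assumed variant: current score scaled by the balance factor."""
    return pair_score(n_ac, n_bc) * balance(n_ac, n_bc)


# Both pairs have the same unweighted score of 20, but the balanced pair
# ranks higher once the weight is applied.
print(weighted_score(10, 10))  # 20.0
print(weighted_score(19, 1))
```

Whether such a strong penalty for unbalanced pairs is desirable would have to be checked on real result lists.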

Table

  • Allow sentence filtering by keywords
  • Results-per-sentence table
    • When a hit is in PMC, also deliver the PubMed ID
  • Provide abundance tables
    • A-Search: provide abundance of all interaction partners
    • A-B-Search:
      • show abundance of A and B members, possibly in separate columns
      • show number of different interaction partners for A and B members (potentially including median and/or mean+std)
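The abundance tables described above could be computed along these lines. The sketch assumes each extracted interaction is available as a pair of gene identifiers, one pair per hit; the function names and this data layout are hypothetical, and the population standard deviation is used as one possible choice for "std".

```python
from collections import Counter, defaultdict
from statistics import mean, median, pstdev


def partner_abundance(interactions, a_list):
    """A-Search: count how often each gene occurs as a partner of an A member."""
    a_set = set(a_list)
    counts = Counter()
    for g1, g2 in interactions:
        if g1 in a_set:
            counts[g2] += 1
        if g2 in a_set:
            counts[g1] += 1
    return counts


def distinct_partner_stats(interactions, members):
    """A-B-Search: number of different partners per member, plus summary stats."""
    member_set = set(members)
    partners = defaultdict(set)
    for g1, g2 in interactions:
        if g1 in member_set:
            partners[g1].add(g2)
        if g2 in member_set:
            partners[g2].add(g1)
    sizes = [len(partners[m]) for m in members]
    if not sizes:
        return {"per_member": {}, "median": None, "mean": None, "std": None}
    return {
        "per_member": dict(zip(members, sizes)),
        "median": median(sizes),
        "mean": mean(sizes),
        "std": pstdev(sizes),
    }


# Hypothetical hits: A1 interacts with X (twice) and Y, A2 interacts with X.
hits = [("A1", "X"), ("A1", "Y"), ("A2", "X"), ("X", "A1")]
print(partner_abundance(hits, ["A1", "A2"]))
print(distinct_partner_stats(hits, ["A1", "A2"]))
```

For an A-B-Search, `distinct_partner_stats` would be called once for the A list and once for the B list, which maps naturally onto separate columns.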

Usage and Advantage of GePi Capabilities

Use Cases

  • Proteomics
  • Transcriptomics?
  • Pathways - potential interaction partners?

Evaluation

  • NatComm scenario against EvexDB
  • Evaluate with event-corpora