Skip to content
Diptanshu Kakwani edited this page Jul 16, 2016 · 9 revisions

Welcome to the goffish_v3 wiki!

Broad classification of required components:

  1. Programming Model
    Specifies the programming abstraction provided by the framework using which applications can be written. Popular choices of programming model includes vertex-centric and subgraph-centric.
  2. Runtime Model
    Responsible for dynamic operations which can be performed during the run time.
  3. Storage Model
    Deals with efficient storage and retrieval of various types of graphs.
  4. Admin Model
    Responsible for admin tasks including logging, setup, UI etc.

Specific requirements:

  • Support for time series graphs and dynamic graphs [1][2]
    In addition to the static graphs, time series and dynamic graphs should be natively supported by the framework. Should be able to process multiple graphs at the same time.
  • Dynamic arrival of graphs [2]
    Process streaming graph data.
  • Subgraph centric programming model [1]
    Programming model which should be exposed to the programmers to use the framework.
  • Repartitioning on the fly [2]
    Repartitioning during run time to support efficient processing of graph.
  • Modify topology [1][2]
    Insertion, deletion and changing edge direction in the graph structure.
  • Updates to storage [3]
    Update persistent stored data on the file system.
  • Operations on meta-graph [1][2]
    Processing meta-graph instead of the whole graph.
  • Transaction jobs and long running jobs [1][2]
    Support for both type of processing: batch and real time.
  • Scale out and in the storage [3]
    Native support for horizontal scaling of storage.
  • Analyzing packing and reliability [2]
    Efficient packing of messages and reliably delivering them.
  • Worker tasks allocation and scheduling [2]
    Allocation of tasks to worker nodes and scheduling tasks.
  • Dynamic mapping of workers to VMs [2]
    Dynamic mapping to efficiently use resources and reduce execution time.
  • Composition of phases, tasks etc [1][2]
    Applications can be designed as collection of phases which can have specific order and resource requirements. Generalization of master-compute, BSP, reduce, init and conclusion.
  • Flexible graph API [1]
    Flexible graph API to support various graph data structure as per the application requirements.
  • Logging, setup, UI, reliability [4]
    Admin tasks to ease development of applications.
  • Write out non-graph data to file system [3]
    Write results to the file system.
  • Pre-computation stats [1][2]
    Stats to gain insights about the graph and perform dynamic operations.
  • Check points [2][3]
    Check points to restart and migrate graph data and worker tasks to help in recovery in case of failures.
  • Different types of input, application and architecture [1][2]
    For applications to leverage the 3-dimensions: input, application and architecture.
Clone this wiki locally