-
Notifications
You must be signed in to change notification settings - Fork 11
Large Static Graphs
Yogesh Simmhan edited this page Jul 28, 2016
·
7 revisions
- Presented by Diptanshu
- V props, E props and topo kept in separate files.
- Partitions can be loaded from disk to memory on the fly, and written back to disk.
- If property changes and not topo, only deltas of property loaded from disk
- Sticky partitions for random set of partitions to be retained in memory. LRU to keep recently used partitions in memory.
- Messages batched and written to disk in vertex ID order. Multiple batches can be in different files.
- Compute method called on vertices in same order. Cursor for each file moves sequentially forward, pointing to current/newer vertex being processed. Ensures single pass thru file.
- Presented by Diptanshu
- Each subgraph fits in mem, but not entire graph in distributed mem
- If subgraph in memory, updated async in memory. Else write to disk. Messages read when loading that subgraph from disk.
- Allows subgraphs to make async progress for k supersteps.
- Inactive vertices need not be loaded into memory for a given superstep after being persisted to disk.
- TODO
- See CCGRID 2016 paper by Safi
- Look at KLA programming model
- Presented by Diptanshu
- Edge owned only by single vertex. Undirected means two edges.
- Direct mutation on edges owned by a vertex: Add, remove, convert from directed to undirected. Visible in same superstep.
- Graph mutation requests for vertices/edges not owned by a vertex (add/remove vertex/edge, optionally with properties). Mutations are applied between two ssteps.
- sending message to no-existed vertex can create it.
- Mutation conflicts resolved using vertex change and vertex resolver.
- TODO: What is the specific order in which mutations happen?
- Outedges can have custom implementations. Iterable also accessible thru different models, like ArrayList, HashMapEdgeList, ByteArrayList, etc.