Storing large SSZ objects in Portal #301

Open

pipermerriam opened this issue May 16, 2024 · 1 comment

pipermerriam commented May 16, 2024

Cross-posting this here so it's easy to find.

https://ethresear.ch/t/distributed-storage-and-cryptographically-secured-retrieval-of-ssz-objects-for-portal-network/19575

This likely has applications in

pipermerriam commented Nov 18, 2024

Ok, I've got a thread that may at least be worth pulling on.

Suppose we have this object:

container BlockType1:
    ...

container BlockType2:
    ...

container History:
    fork_1_blocks: List[BlockType1, ...]
    fork_2_blocks: List[BlockType2, ...]

Let's suppose that the serialized version of this object takes 2GB for the data in the first list and 6GB for the data in the second list. In the original scheme, we SSZ encode it and then re-hash the serialized bytes as a ByteList because we need to even out the spread of the data. All of the data needs to be provably part of the whole, and all of the data needs to be provably in the correct location in the network. The ByteList solves this.
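For reference, here is a naive sketch of what hashing the serialized bytes as a ByteList involves (simplified, and far too memory-hungry for multi-GB inputs as written, but it shows where the even 32-byte chunking comes from):

import hashlib

CHUNK = 32

def sha256(data):
    return hashlib.sha256(data).digest()

def pack_bytes(data):
    # Split the serialized bytes into 32-byte chunks, zero-padding the last one.
    chunks = [data[i:i + CHUNK].ljust(CHUNK, b"\x00")
              for i in range(0, len(data), CHUNK)]
    return chunks or [b"\x00" * CHUNK]

def merkleize(chunks, limit):
    # Pad the chunk list to the next power of two of `limit` with zero chunks
    # and hash pairwise up to a single 32-byte root. (A real implementation
    # uses precomputed zero-subtree hashes instead of materializing padding.)
    width = 1 if limit <= 1 else 1 << (limit - 1).bit_length()
    nodes = chunks + [b"\x00" * CHUNK] * (width - len(chunks))
    while len(nodes) > 1:
        nodes = [sha256(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

def byte_list_root(data, max_length):
    # hash_tree_root of ByteList[max_length]: merkleize the packed chunks
    # against the chunk limit, then mix in the actual byte length.
    limit = (max_length + CHUNK - 1) // CHUNK
    root = merkleize(pack_bytes(data), limit)
    return sha256(root + len(data).to_bytes(32, "little"))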

Now, suppose that we use the normal SSZ hash, but along with that hash, we also include some metadata about the sizes: specifically, a sequence of paths into the merkle trie of the SSZ hash, together with the size of the variable-sized values between each pair of adjacent paths.

# History.fork_1_blocks merkle trie range
0x0 - 0xdeadbeef -> 2GB

# History.fork_2_blocks merkle trie range
0xdeadbeef - 0xffff -> 6GB

I think that with this simple addition of metadata telling us how much serialized data to expect within a given range of the merkle trie, we satisfy the network conditions that we previously needed the ByteList for. Assuming we were mapping this data onto the full DHT address space, we would do a proportional map of 0x0 - 0xdeadbeef onto the first 1/4th of the address space, and then map 0xdeadbeef - 0xffff onto the latter 3/4ths of the address space.
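A rough sketch of that proportional mapping (the names and structure here are purely illustrative, not a spec): each merkle-trie range gets a slice of the 2**256 content-id space proportional to its share of the total serialized size.

ADDRESS_SPACE = 2**256  # full DHT node-id / content-id space

def map_ranges_to_address_space(range_sizes):
    # range_sizes: ordered (label, serialized_size_in_bytes) pairs, one per
    # merkle-trie range in the metadata. Returns (label, start_id, end_id)
    # slices of the address space, proportional to each range's share of the
    # total serialized size.
    total = sum(size for _, size in range_sizes)
    mapping, cursor = [], 0
    for label, size in range_sizes:
        width = ADDRESS_SPACE * size // total
        mapping.append((label, cursor, cursor + width))
        cursor += width
    # Hand any integer-division remainder to the final range so the slices
    # cover the whole address space.
    label, start, _ = mapping[-1]
    mapping[-1] = (label, start, ADDRESS_SPACE)
    return mapping

GB = 1024**3
for label, start, end in map_ranges_to_address_space(
    [("History.fork_1_blocks", 2 * GB), ("History.fork_2_blocks", 6 * GB)]
):
    print(f"{label}: {(end - start) / ADDRESS_SPACE:.0%} of the address space")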

Some caveats to this are that the underlying data may not actually be evenly distributed, such as some blocks having more transactions than others... which would result in some unevenness, but my intuition says not enough to cause a problem. And since we're talking about objects that we would be committing to ahead of time, we could actually do the work to determine what level of granularity we would need to provide for the merkle paths to achieve the necessary data distribution.
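As a rough illustration of that last point (the helper and the TARGET_BUCKET value are hypothetical, just to show the shape of the computation): since we know the per-element serialized sizes ahead of time, we can walk the list and emit a merkle-path boundary whenever the accumulated bytes cross a target bucket size, which bounds how uneven any single published range can be.

TARGET_BUCKET = 256 * 1024**2  # arbitrary example: ~256MB of serialized data per range

def choose_boundaries(element_sizes, target=TARGET_BUCKET):
    # element_sizes: serialized byte size of each list element, in order.
    # Returns the element indices after which to publish a merkle path, plus
    # the byte count covered by each resulting range.
    boundaries, range_bytes, acc = [], [], 0
    for index, size in enumerate(element_sizes):
        acc += size
        if acc >= target:
            boundaries.append(index + 1)
            range_bytes.append(acc)
            acc = 0
    if acc:
        range_bytes.append(acc)  # tail range after the last boundary
    return boundaries, range_bytes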
