Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unique selector bytes for History, Beacon, and State networks #233

Merged
merged 1 commit into from
Nov 27, 2023

Conversation

mrferris
Copy link
Contributor

@mrferris mrferris commented Oct 2, 2023

Content keys on different portal networks can and do have overlapping selectors. For example the first byte of a history header content key is 0x00 and the first byte of a beacon bootstrap content key is 0x00.

I'm proposing that, as convention, we un-overlap these selectors for the sake of clarity. I'm not proposing that we make it required that these not overlap, which would put a limit on the number of auxillary networks (eg ethstorage) that can be built in the future.

The motivation for doing this is the ambiguity that it removes. With overlapping selectors a given content key needs to always be accompanied by information about which network it belongs to, otherwise it means nothing to the parser/reader. I think that this ambiguity will become a frustration while developing/debugging/discussing multi-subnet portal clients and the software that uses them.

History network keeps its selectors, so not much live data would need to be dumped to do this. Only beacon which is young enough for this to be a good time to make this change if we choose to.

@pipermerriam
Copy link
Member

This is good to merge as-is...

When Jason and I chatted about this we talked about making the selector LEB128 encoded, which I believe changes nothing until we hit a selector value >=128. We could then segment off the range between 0-127 by nibbles which would let us do:

  • 0x00-0x0f: history
  • 0x10-0x1f: beacon
  • 0x20-0x2f: state
  • etc

This has the nice property of being able to have the high nibbles tell us which network a key belongs to and giving us 8 blocks of keys to dole out before we have to expand the LEB encoded value to a 2-byte length value.

@@ -155,7 +155,7 @@ HISTORICAL_ROOTS_LIMIT = 2**24 # = 16,777,216

```
light_client_bootstrap_key = Container(block_hash: Bytes32)
selector = 0x00
selector = 0x03
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating this to 0x10 (and 0x11, 0x12, etc) would give us a nice namespace and give us room to expand the history network set of keys without awkward gaps.

state-network.md Outdated
@@ -140,7 +140,7 @@ A leaf node from the main account trie and accompanying merkle proof against a r

```
account_trie_proof_key := Container(address: Bytes20, state_root: Bytes32)
selector := 0x00
selector := 0x08
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, 0x20, 0x21 and so on would give us some nice namespacing.

Copy link
Collaborator

@kdeme kdeme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd like to understand a bit better (in the form of examples?) where/when this is deemed needed or useful?
Wouldn't the content always be together with its network context?

e.g. Any of the portal JSON RPC APIs is already per network prefix separated. The networks themselves are separated.

I guess test vectors could be parsed and tested in a more modular sense with this change? But that is also achievable by adding little network metadata to them (besides that you just might want to keep it seperate per network anyhow)

Side note: It also breaks compatibility with using these Prefixes as SSZ Union. And yes, the SSZ Union is not explicitly specified in the specs anymore, but Fluffy does still use it as it originally was in the spec, considering it is the same currently.
I'm not fully bound to it and it is not a huge change, but if I don't see a real benefit, I can't say I'm pro for it either :)

@carver
Copy link
Contributor

carver commented Oct 5, 2023

I think I'd like to understand a bit better (in the form of examples?) where/when this is deemed needed or useful?
Wouldn't the content always be together with its network context?

I see this as most useful when the content keys start to escape the network. Like this link:
http://glados.ethportal.net/content/key/0x002d5460dbc4dfe734c7f00667e0cd678767c35f8ec8b08f033c2a299879bdb333/

Using the current url, there is no reliable way to determine what kind of content is at that key. In fact, nothing is stopping keys like this from being duplicated on multiple networks.

Modifying the ID byte could solve this problem for us.


There are other solutions for this particular scenario, of course. The URL format could be updated to include the network in the path, for example. More broadly, any app anywhere that wants to tell you something about a content key will always have to ask you: what is the key, and also which network did it come from?


So it seems like a simple solution to modify the byte in order to can clean up 3rd party interfaces that accept content keys, without requiring a second input.

Sure, this is a nice-to-have, but it's also a nearly-impossible-to-change-later, and a not-too-hard-to-change-now. :)
(though I admit ignorance into how hard it is for you to make the SSZ Union change in Fluffy)

@mrferris mrferris force-pushed the sequential-selectors branch from b4713bb to 1ec1c8d Compare November 3, 2023 21:25
@mrferris mrferris requested a review from kdeme November 3, 2023 21:30
Copy link
Contributor

@ScottyPoi ScottyPoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 from me.

Will open a PR on Ultralight and merge it when this is final

@kdeme
Copy link
Collaborator

kdeme commented Nov 17, 2023

I personally still don't see that much use for this change. But I'm not specifically against it either, so if everyone else agrees with this, good to go for me too.

Sure, this is a nice-to-have, but it's also a nearly-impossible-to-change-later, and a not-too-hard-to-change-now. :)

This is probably the most compelling reason I've heard :-)

@pipermerriam
Copy link
Member

I'm a strong +1 that we should do this.

@mrferris mrferris merged commit 72327da into master Nov 27, 2023
3 checks passed
@pipermerriam
Copy link
Member

I think there's a follow up PR that needs to be done to update the main spec documents to denote which networks use which namespaces.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants