diff --git a/implementation-details-overlay.md b/implementation-details-overlay.md index dc252e1..adc8255 100644 --- a/implementation-details-overlay.md +++ b/implementation-details-overlay.md @@ -198,7 +198,7 @@ Each bucket is limited to `K` total members ### D.3.d - Replacement cache -Each bucket maintains a set of additional nodes known to be at the appropriate distance. When a node is removed from the routing table it is replaced by a node from the replacement cache when one is available. The cache is managed such that it remains disjoint from the nodes in the corresponding bucket. +Each bucket maintains a set of additional nodes known to be at the appropriate distance. When a node is removed from the routing table it is replaced by a node from the replacement cache when one is available. The cache is managed such that it remains disjoint from the nodes in the corresponding bucket. ## D.4 - Retrieve nodes at specified log-distance @@ -218,7 +218,7 @@ The client uses a set of bootnodes to acquire an initial view of the network. ### E.1.a - Bootnodes -Each supported sub protocol can have its own set of bootnodes. These records can be either hard coded into the client or provided via client configuration. +Each supported sub protocol can have its own set of bootnodes. These records can be either hard coded into the client or provided via client configuration. ## E.2 - Population of routing table @@ -226,7 +226,7 @@ The client actively seeks to populate its routing table by performing [RFN](#TOD ## E.3 - Liveliness checks -The client tracks *liveliness* of nodes in its routing table and periodically checks the liveliness of the node in its routing table which was least recently checked. +The client tracks _liveliness_ of nodes in its routing table and periodically checks the liveliness of the node in its routing table which was least recently checked. ### E.3.a - Rate Limiting Liviliness Checks @@ -238,7 +238,7 @@ Management of stored content. 
## F.1 - Content can be stored -Content can be stored in a persistent database. Databases are segmented by sub protocol. +Content can be stored in a persistent database. Databases are segmented by sub protocol. ## F.2 - Content can be retrieved by `content_id` @@ -248,12 +248,10 @@ Given a known `content_id` the corresponding content payload can be retrieved. Content can be removed. - ## F.4 - Query furthest by distance Retrieval of the content from the database which is furthest from a provided `node_id` using the custom distance function. - ## F.5 - Total size of stored content Retrieval of the total number of bytes stored. @@ -274,7 +272,7 @@ The ability to listening for an inbound connection from another node with a `con ## G.2 - Enforcement of maximum stored content size -When the total size of stored content exceeds the configured maximum content storage size the content which is furthest from the local `node_id` is evicted in a timely manner. This should also result in any "data radius" values relevant to this network being adjusted. +When the total size of stored content exceeds the configured maximum content storage size the content which is furthest from the local `node_id` is evicted in a timely manner. This should also result in any "data radius" values relevant to this network being adjusted. ## G.3 - Retrieval via FINDCONTENT/FOUNDCONTENT & uTP @@ -298,13 +296,13 @@ Support for receipt of content using the OFFER/ACCEPT messages and uTP sub proto ### G.4.a - Handle incoming gossip -Client can listen for incoming OFFER messages, responding with an ACCEPT message for any offered content which is of interest to the client. +Client can listen for incoming OFFER messages, responding with an ACCEPT message for any offered content which is of interest to the client. #### G.4.a.1 - Receipt via uTP After sending an ACCEPT response to an OFFER request the client listens for an inbound uTP stream with the `connection-id` that was sent with the ACCEPT response. 
-### G.4.b - Neighborhood Gossip Propogation +### G.4.b - Neighborhood Gossip Propagation Upon receiving and validating gossip content, the content should then be gossiped to some set of interested nearby peers. @@ -312,14 +310,12 @@ Upon receiving and validating gossip content, the content should then be gossipe Upon receiving an ACCEPT message in response to our own OFFER message the client can initiate a uTP stream with the other node and can send the content payload across the stream. - ## G.5 - Serving Content The client should listen for FINDCONTENT messages. When a FINDCONTENT message is received either the requested content or the nodes known to be closest to the content are returned via a FOUNDCONTENT message. - # H - JSON-RPC Endpoints that require for the portal network wire protocol. diff --git a/portal-wire-protocol.md b/portal-wire-protocol.md index 6bdafa8..c021a70 100644 --- a/portal-wire-protocol.md +++ b/portal-wire-protocol.md @@ -17,7 +17,9 @@ Unsupported messages **SHOULD** receive a `TALKRESP` message with an empty paylo All protocol identifiers consist of two bytes. The first byte is "`P`" (`0x50`), to indicate "the Portal network", the second byte is a specific network identifier. 
### Mainnet identifiers + Currently defined mainnet protocol identifiers: + - Inclusive range of `0x5000` - `0x5009`: Reserved for future networks or network upgrades - `0x500A`: Execution State Network - `0x500B`: Execution History Network @@ -25,8 +27,11 @@ Currently defined mainnet protocol identifiers: - `0x500D`: Execution Canonical Transaction Index Network - `0x500E`: Execution Verkle State Network - `0x500F`: Execution Transaction Gossip Network + ### Angelfood identifiers + Currently defined `angelfood` protocol identifiers: + - `0x504A`: Execution State Network - `0x504B`: Execution History Network - `0x504C`: Beacon Chain Network @@ -50,18 +55,18 @@ The SHA256 Content ID derivation function is defined as: content_id = sha256(content_key) ``` - ## Nodes and Node IDs -Nodes in the portal network are represented by their [EIP-778 Ethereum Node Record (ENR)](https://eips.ethereum.org/EIPS/eip-778) from the Discovery v5 network. A node's `node-id` is derived according to the node's identity scheme, which is specified in the node's ENR. A node's `node-id` represents its address in the DHT. Node IDs are interchangeable between 32 byte identifiers and 256 bit integers. +Nodes in the portal network are represented by their [EIP-778 Ethereum Node Record (ENR)](https://eips.ethereum.org/EIPS/eip-778) from the Discovery v5 network. A node's `node-id` is derived according to the node's identity scheme, which is specified in the node's ENR. A node's `node-id` represents its address in the DHT. Node IDs are interchangeable between 32 byte identifiers and 256 bit integers. ## Request - Response Messages The messages in the protocol are transmitted using the `TALKREQ` and `TALKRESP` messages from the base [Node Discovery Protocol](https://github.com/ethereum/devp2p/blob/master/discv5/discv5-wire.md#talkreq-request-0x05). All messages in the protocol have a request-response interaction: -* Request messages **MUST** be sent using a `TALKREQ` message. 
-* Response messages **MUST** be sent using the corresponding `TALKRESP` message. + +- Request messages **MUST** be sent using a `TALKREQ` message. +- Response messages **MUST** be sent using the corresponding `TALKRESP` message. All messages are encoded as an [SSZ Union](https://github.com/ethereum/consensus-specs/blob/dev/ssz/simple-serialize.md#union) type. @@ -87,8 +92,8 @@ selector = 0x00 ping = Container(enr_seq: uint64, custom_payload: ByteList[2048]) ``` -* `enr_seq`: The node's current sequence number of their ENR record. -* `custom_payload`: Custom payload specified per the network. +- `enr_seq`: The node's current sequence number of their ENR record. +- `custom_payload`: Custom payload specified per the network. #### Pong (0x01) @@ -99,8 +104,8 @@ selector = 0x01 pong = Container(enr_seq: uint64, custom_payload: ByteList[2048]) ``` -* `enr_seq`: The node's current sequence number of their ENR record. -* `custom_payload`: Custom payload specified per the network. +- `enr_seq`: The node's current sequence number of their ENR record. +- `custom_payload`: Custom payload specified per the network. #### Find Nodes (0x02) @@ -111,9 +116,9 @@ selector = 0x02 find_nodes = Container(distances: List[uint16, max_length=256]) ``` -* `distances`: a list of distances for which the node is requesting ENR records for. - * Each distance **MUST** be within the inclusive range `[0, 256]` - * Each distance in the list **MUST** be unique. +- `distances`: a list of distances for which the node is requesting ENR records for. + - Each distance **MUST** be within the inclusive range `[0, 256]` + - Each distance in the list **MUST** be unique. #### Nodes (0x03) @@ -124,11 +129,10 @@ selector = 0x03 nodes = Container(total: uint8, enrs: List[ByteList[2048], max_length=32]) ``` -* `total`: The total number of `Nodes` response messages being sent. Currently fixed to only 1 response message. -* `enrs`: List of byte strings, each of which is an RLP encoded ENR record. 
- * Individual ENR records **MUST** correspond to one of the requested distances. - * It is invalid to return multiple ENR records for the same `node_id`. - * The ENR record of the requesting node **SHOULD** be filtered out of the list. +- `total`: The total number of `Nodes` response messages being sent. Currently fixed to only 1 response message. +- `enrs`: List of byte strings, each of which is an RLP encoded ENR record. + _ Individual ENR records **MUST** correspond to one of the requested distances. + _ It is invalid to return multiple ENR records for the same `node_id`. \* The ENR record of the requesting node **SHOULD** be filtered out of the list. #### Find Content (0x04) @@ -139,7 +143,7 @@ selector = 0x04 find_content = Container(content_key: ByteList[2048]) ``` -* `content_key`: The key for the content being requested. The encoding of `content_key` is specified per the network. +- `content_key`: The key for the content being requested. The encoding of `content_key` is specified per the network. #### Content (0x05) @@ -153,38 +157,41 @@ selector = 0x05 content = Union[connection_id: Bytes2, content: ByteList[2048], enrs: List[ByteList[2048], 32]] ``` -* `connection_id`: Connection ID to set up a uTP stream to transmit the requested data. - * Connection ID values **SHOULD** be randomly generated. -* `content`: byte string of the requested content. - * This field **MUST** be used when the requested data can fit in this single response. -* `enrs`: List of byte strings, each of which is an RLP encoded ENR record. - * The list of ENR records **MUST** be closest nodes to the requested content that the responding node has stored. - * The set of derived `node_id` values from the ENR records **MUST** be unique. - * The ENR record of the requesting & responding node **SHOULD** be filtered out of the list. +- `connection_id`: Connection ID to set up a uTP stream to transmit the requested data. + - Connection ID values **SHOULD** be randomly generated. 
+- `content`: byte string of the requested content. + - This field **MUST** be used when the requested data can fit in this single response. +- `enrs`: List of byte strings, each of which is an RLP encoded ENR record. + - The list of ENR records **MUST** be closest nodes to the requested content that the responding node has stored. + - The set of derived `node_id` values from the ENR records **MUST** be unique. + - The ENR record of the requesting & responding node **SHOULD** be filtered out of the list. If the node does not hold the requested content, and the node does not know of any nodes with eligible ENR values, then the node **MUST** return `enrs` as an empty list. -Upon *sending* this message with a `connection_id`, the sending node **SHOULD** *listen* for an incoming uTP stream with the generated `connection_id`. +Upon _sending_ this message with a `connection_id`, the sending node **SHOULD** _listen_ for an incoming uTP stream with the generated `connection_id`. -Upon *receiving* this message with a `connection_id`, the receiving node **SHOULD** *initiate* a uTP stream with the received `connection_id`. +Upon _receiving_ this message with a `connection_id`, the receiving node **SHOULD** _initiate_ a uTP stream with the received `connection_id`. ##### `content` Union Definition The `Union` defined in the `content` field of the `Content (0x05)` message is defined as below: **`connection_id`** + ``` selector = 0x00 ssz-type = Bytes2 ``` **`content`** + ``` selector = 0x01 ssz-type = ByteList[2048] ``` **`enrs`** + ``` selector = 0x02 ssz-type = List[ByteList[2048], 32] @@ -199,7 +206,7 @@ selector = 0x06 offer = Container(content_keys: List[ByteList[2048], max_length=64]) ``` -* `content_keys`: A list of encoded `content_key` entries. The encoding of each `content_key` is specified per the network. +- `content_keys`: A list of encoded `content_key` entries. The encoding of each `content_key` is specified per the network. 
#### Accept (0x07) @@ -212,14 +219,14 @@ selector = 0x07 accept = Container(connection_id: Bytes2, content_keys: BitList[max_length=64]] ``` -* `connection_id`: Connection ID to set up a uTP stream to transmit the requested data. - * ConnectionID values **SHOULD** be randomly generated. -* `content_keys`: Signals which content keys are desired. - * A bit-list corresponding to the offered keys with the bits in the positions of the desired keys set to `1`. +- `connection_id`: Connection ID to set up a uTP stream to transmit the requested data. + - ConnectionID values **SHOULD** be randomly generated. +- `content_keys`: Signals which content keys are desired. + - A bit-list corresponding to the offered keys with the bits in the positions of the desired keys set to `1`. -Upon *sending* this message, the requesting node **SHOULD** *listen* for an incoming uTP stream with the generated `connection_id`. +Upon _sending_ this message, the requesting node **SHOULD** _listen_ for an incoming uTP stream with the generated `connection_id`. -Upon *receiving* this message, the serving node **SHOULD** *initiate* a uTP stream with the received `connection_id`. +Upon _receiving_ this message, the serving node **SHOULD** _initiate_ a uTP stream with the received `connection_id`. ##### Content Encoding @@ -234,6 +241,7 @@ The maximum size allowed for this application is limited to `uint32`. The content item itself MUST be encoded as is defined for each specific network and content type. 
The encoded data of n encoded content items to be send over the stream can be formalized as: + ```py # n encoded content items to be send over the stream, with n <= 64 encoded_content_list = [content_0, content_1, ..., content_n] @@ -261,13 +269,11 @@ Similarly, we define a `logdistance` function identically to the Discovery v5 ne logdistance(a: uint256, b: uint256) = log2(distance(a, b)) ``` - ### Test Vectors A collection of test vectors for this specification can be found in the [Portal wire test vectors](./portal-wire-test-vectors.md) document. - ## Routing Table Most networks that use the Portal Wire Protocol will form an independent DHT which requires individual nodes to maintain a routing table. @@ -303,14 +309,12 @@ port := UDP port number ### Protocol Specific Node State -Sub protocols may define additional node state information which should be tracked in the node state database. This information will typically be transmitted in the `Ping.custom_data` and `Pong.custom_data` fields. - +Sub protocols may define additional node state information which should be tracked in the node state database. This information will typically be transmitted in the `Ping.custom_data` and `Pong.custom_data` fields. ## Algorithms Here we define a collection of generic algorithms which can be applied to a sub-protocol implementing the wire protocol. - ### Lookup The term lookup refers to the lookup algorithm described in section 2.3 of the Kademlia paper. @@ -339,9 +343,10 @@ To find a piece of content for `content-id`, a node performs a content lookup vi ### Storing Content -The concept of content storage is only applicable to sub-protocols that implement persistant storage of data. +The concept of content storage is only applicable to sub-protocols that implement persistent storage of data. 
Content will get stored by a node when: + - the node receives the content through the `Offer` - `Accept` message flow and the content falls within the node's radius - the node requests content through the `FindContent` - `Content` message flow and the content falls within the node's radius @@ -349,15 +354,15 @@ The network cannot make guarantees about the storage of particular content. A la ### Neighborhood Gossip -We use the term *neighborhood gossip* to refer to the process through which content is disseminated to all of the DHT nodes *near* the location in the DHT where the content is located. +We use the term _neighborhood gossip_ to refer to the process through which content is disseminated to all of the DHT nodes _near_ the location in the DHT where the content is located. The process works as follows: - A DHT node is offered and receives a piece of content that it is interested in. -- This DHT node checks their routing table for `k` nearby DHT nodes that should also be interested in the content. Those `k` nodes **SHOULD** not include the node that originally provided aformentioned content. +- This DHT node checks their routing table for `k` nearby DHT nodes that should also be interested in the content. Those `k` nodes **SHOULD** not include the node that originally provided aforementioned content. - If the DHT node finds `n` or more DHT nodes interested it selects `n` of these nodes and offers the content to them. - If the DHT node finds less than `n` DHT nodes interested, it launches a node lookup with target `content-id` and it -offers the content to maximum `n` of the newly discovered nodes. + offers the content to maximum `n` of the newly discovered nodes. The process above should quickly saturate the area of the DHT where the content is located and naturally terminate as more nodes become aware of the content.
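
The neighborhood gossip steps above can be sketched roughly as follows. This is a minimal illustration, not the reference behaviour of any client. The `Peer` type, its `radius` field, the `interested` check, and the `lookup` callback are hypothetical names introduced here, and the XOR `distance` is an assumption, since a sub-protocol may define its own distance function. The fallback branch reads "newly discovered nodes" as the lookup result minus the original provider, which is one possible interpretation of the text.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Peer:
    node_id: int  # node ID as an unsigned integer
    radius: int   # advertised data radius (hypothetical field)


def distance(a: int, b: int) -> int:
    # Assumption: XOR metric; a sub-protocol may define its own function.
    return a ^ b


def interested(peer: Peer, content_id: int) -> bool:
    # Assumption: a peer wants content that falls within its radius.
    return distance(peer.node_id, content_id) <= peer.radius


def select_gossip_targets(
    routing_table: Iterable[Peer],
    content_id: int,
    source_id: int,
    k: int,
    n: int,
    lookup: Callable[[int], Iterable[Peer]],
) -> List[Peer]:
    def closeness(p: Peer) -> int:
        return distance(p.node_id, content_id)

    # Take the k nearest known nodes, never the node that provided the content.
    nearby = sorted(
        (p for p in routing_table if p.node_id != source_id), key=closeness
    )[:k]
    local = [p for p in nearby if interested(p, content_id)]
    if len(local) >= n:
        return local[:n]

    # Fewer than n interested nodes: run a node lookup targeting content_id
    # and offer to at most n of the interested nodes it returns.
    seen = {source_id}
    found: List[Peer] = []
    for p in sorted(lookup(content_id), key=closeness):
        if p.node_id not in seen and interested(p, content_id):
            seen.add(p.node_id)
            found.append(p)
    return found[:n]
```

Sorting by closeness to `content_id` before truncating is a design choice of this sketch; the text only requires that `n` interested nodes be selected.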