fix typo #320

Open
wants to merge 3 commits into
base: master
18 changes: 7 additions & 11 deletions implementation-details-overlay.md
Expand Up @@ -198,7 +198,7 @@ Each bucket is limited to `K` total members

### D.3.d - Replacement cache

Each bucket maintains a set of additional nodes known to be at the appropriate distance. When a node is removed from the routing table it is replaced by a node from the replacement cache when one is available. The cache is managed such that it remains disjoint from the nodes in the corresponding bucket.
Each bucket maintains a set of additional nodes known to be at the appropriate distance. When a node is removed from the routing table it is replaced by a node from the replacement cache when one is available. The cache is managed such that it remains disjoint from the nodes in the corresponding bucket.
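
As a rough illustration of the replacement-cache rule above, the sketch below keeps the cache disjoint from the bucket and promotes a cached node when a member is removed. The class name `Bucket`, the default `K = 16`, and the choice to promote the most recently cached node are assumptions made for this example; the spec text only requires that a replacement is taken from the cache when one is available.

```python
from collections import deque

K = 16  # assumed bucket size; the spec only names it `K`


class Bucket:
    """One routing-table bucket plus its replacement cache (hypothetical layout)."""

    def __init__(self, k=K):
        self.k = k
        self.nodes = []              # bucket members, at most k
        self.replacements = deque()  # candidates, kept disjoint from self.nodes

    def add(self, node_id):
        if node_id in self.nodes:
            return
        if len(self.nodes) < self.k:
            # Moving a node into the bucket must remove it from the cache.
            if node_id in self.replacements:
                self.replacements.remove(node_id)
            self.nodes.append(node_id)
        elif node_id not in self.replacements:
            self.replacements.append(node_id)

    def remove(self, node_id):
        if node_id not in self.nodes:
            return
        self.nodes.remove(node_id)
        if self.replacements:
            # Promote the most recently cached node (a policy assumption).
            self.nodes.append(self.replacements.pop())
```

With `k=2`, adding nodes 1, 2 and 3 leaves 3 in the cache, and removing node 1 promotes 3 into the bucket.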

## D.4 - Retrieve nodes at specified log-distance

Expand All @@ -218,15 +218,15 @@ The client uses a set of bootnodes to acquire an initial view of the network.

### E.1.a - Bootnodes

Each supported sub protocol can have its own set of bootnodes. These records can be either hard coded into the client or provided via client configuration.
Each supported sub protocol can have its own set of bootnodes. These records can be either hard coded into the client or provided via client configuration.

## E.2 - Population of routing table

The client actively seeks to populate its routing table by performing [RFN](#TODO) lookups to discover new nodes for the routing table.

## E.3 - Liveliness checks

The client tracks *liveliness* of nodes in its routing table and periodically checks the liveliness of the node in its routing table which was least recently checked.
The client tracks _liveliness_ of nodes in its routing table and periodically checks the liveliness of the node in its routing table which was least recently checked.

### E.3.a - Rate Limiting Liveliness Checks

Expand All @@ -238,7 +238,7 @@ Management of stored content.

## F.1 - Content can be stored

Content can be stored in a persistent database. Databases are segmented by sub protocol.
Content can be stored in a persistent database. Databases are segmented by sub protocol.

## F.2 - Content can be retrieved by `content_id`

Expand All @@ -248,12 +248,10 @@ Given a known `content_id` the corresponding content payload can be retrieved.

Content can be removed.


## F.4 - Query furthest by distance

Retrieval of the content from the database which is furthest from a provided `node_id` using the custom distance function.


## F.5 - Total size of stored content

Retrieval of the total number of bytes stored.
Expand All @@ -274,7 +272,7 @@ The ability to listening for an inbound connection from another node with a `con

## G.2 - Enforcement of maximum stored content size

When the total size of stored content exceeds the configured maximum content storage size the content which is furthest from the local `node_id` is evicted in a timely manner. This should also result in any "data radius" values relevant to this network being adjusted.
When the total size of stored content exceeds the configured maximum content storage size the content which is furthest from the local `node_id` is evicted in a timely manner. This should also result in any "data radius" values relevant to this network being adjusted.
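
A minimal sketch of the eviction rule in G.2, assuming content is held in an in-memory dict keyed by integer `content_id` and that distance is XOR (the text only says "furthest from the local `node_id`"). The function name `evict_furthest` and the dict layout are hypothetical; adjusting the data radius afterwards is not shown.

```python
def evict_furthest(store, local_node_id, max_bytes):
    """Evict the furthest content until the store fits; return evicted content ids.

    `store` maps content_id (int) -> payload (bytes).
    XOR distance is an assumption of this sketch.
    """
    evicted = []
    total = sum(len(payload) for payload in store.values())
    while total > max_bytes and store:
        # Pick the content id with the largest distance from the local node.
        furthest = max(store, key=lambda cid: cid ^ local_node_id)
        total -= len(store.pop(furthest))
        evicted.append(furthest)
    return evicted
```

A real client would query its database for the furthest item (see F.4) instead of scanning every key.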

## G.3 - Retrieval via FINDCONTENT/FOUNDCONTENT & uTP

Expand All @@ -298,28 +296,26 @@ Support for receipt of content using the OFFER/ACCEPT messages and uTP sub proto

### G.4.a - Handle incoming gossip

Client can listen for incoming OFFER messages, responding with an ACCEPT message for any offered content which is of interest to the client.
Client can listen for incoming OFFER messages, responding with an ACCEPT message for any offered content which is of interest to the client.

#### G.4.a.1 - Receipt via uTP

After sending an ACCEPT response to an OFFER request the client listens for an inbound uTP stream with the `connection-id` that was sent with the ACCEPT response.

### G.4.b - Neighborhood Gossip Propogation
### G.4.b - Neighborhood Gossip Propagation

Upon receiving and validating gossip content, the content should then be gossiped to some set of interested nearby peers.

#### G.4.b.1 - Sending content via uTP

Upon receiving an ACCEPT message in response to our own OFFER message the client can initiate a uTP stream with the other node and can send the content payload across the stream.


## G.5 - Serving Content

The client should listen for FINDCONTENT messages.

When a FINDCONTENT message is received either the requested content or the nodes known to be closest to the content are returned via a FOUNDCONTENT message.


# H - JSON-RPC

Endpoints required for the portal network wire protocol.
Expand Down
91 changes: 48 additions & 43 deletions portal-wire-protocol.md
Expand Up @@ -17,16 +17,21 @@ Unsupported messages **SHOULD** receive a `TALKRESP` message with an empty paylo
All protocol identifiers consist of two bytes. The first byte is "`P`" (`0x50`), to indicate "the Portal network", the second byte is a specific network identifier.

### Mainnet identifiers

Currently defined mainnet protocol identifiers:

- Inclusive range of `0x5000` - `0x5009`: Reserved for future networks or network upgrades
- `0x500A`: Execution State Network
- `0x500B`: Execution History Network
- `0x500C`: Beacon Chain Network
- `0x500D`: Execution Canonical Transaction Index Network
- `0x500E`: Execution Verkle State Network
- `0x500F`: Execution Transaction Gossip Network
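
To make the two-byte layout concrete, here is a small sketch that checks the `0x50` prefix and looks up the network byte. The names `PORTAL_PREFIX`, `MAINNET_NETWORKS`, and `parse_protocol_id` are made up for this example; the table only copies the mainnet list above.

```python
PORTAL_PREFIX = 0x50  # ASCII "P"

# Second byte -> network name, copied from the mainnet list above.
MAINNET_NETWORKS = {
    0x0A: "Execution State Network",
    0x0B: "Execution History Network",
    0x0C: "Beacon Chain Network",
    0x0D: "Execution Canonical Transaction Index Network",
    0x0E: "Execution Verkle State Network",
    0x0F: "Execution Transaction Gossip Network",
}


def parse_protocol_id(protocol_id: bytes) -> int:
    """Return the network byte of a two-byte Portal protocol identifier."""
    if len(protocol_id) != 2:
        raise ValueError("protocol identifier must be exactly two bytes")
    if protocol_id[0] != PORTAL_PREFIX:
        raise ValueError("first byte must be 0x50 ('P')")
    return protocol_id[1]
```

For instance, `MAINNET_NETWORKS[parse_protocol_id(bytes.fromhex("500b"))]` gives the Execution History Network.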

### Angelfood identifiers

Currently defined `angelfood` protocol identifiers:

- `0x504A`: Execution State Network
- `0x504B`: Execution History Network
- `0x504C`: Beacon Chain Network
Expand All @@ -50,18 +55,18 @@ The SHA256 Content ID derivation function is defined as:
content_id = sha256(content_key)
```
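
The derivation above maps directly onto Python's standard `hashlib`. The function name `content_id_sha256` is chosen for this example only.

```python
import hashlib


def content_id_sha256(content_key: bytes) -> bytes:
    """Derive a 32-byte content id as sha256(content_key)."""
    return hashlib.sha256(content_key).digest()
```

The result is always 32 bytes, the same width as a node id.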


## Nodes and Node IDs

Nodes in the portal network are represented by their [EIP-778 Ethereum Node Record (ENR)](https://eips.ethereum.org/EIPS/eip-778) from the Discovery v5 network. A node's `node-id` is derived according to the node's identity scheme, which is specified in the node's ENR. A node's `node-id` represents its address in the DHT. Node IDs are interchangeable between 32 byte identifiers and 256 bit integers.
Nodes in the portal network are represented by their [EIP-778 Ethereum Node Record (ENR)](https://eips.ethereum.org/EIPS/eip-778) from the Discovery v5 network. A node's `node-id` is derived according to the node's identity scheme, which is specified in the node's ENR. A node's `node-id` represents its address in the DHT. Node IDs are interchangeable between 32 byte identifiers and 256 bit integers.
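
A short sketch of the conversion between the two node-id representations. Big-endian byte order is an assumption here, since the paragraph above does not state one, and the helper names are hypothetical.

```python
def node_id_to_int(node_id: bytes) -> int:
    """Interpret a 32-byte node id as a 256-bit integer (big-endian assumed)."""
    if len(node_id) != 32:
        raise ValueError("node id must be exactly 32 bytes")
    return int.from_bytes(node_id, "big")


def int_to_node_id(value: int) -> bytes:
    """Encode a 256-bit integer as a 32-byte node id (big-endian assumed)."""
    # to_bytes raises OverflowError if the value does not fit in 32 bytes.
    return value.to_bytes(32, "big")
```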

## Request - Response Messages

The messages in the protocol are transmitted using the `TALKREQ` and `TALKRESP` messages from the base [Node Discovery Protocol](https://github.com/ethereum/devp2p/blob/master/discv5/discv5-wire.md#talkreq-request-0x05).

All messages in the protocol have a request-response interaction:
* Request messages **MUST** be sent using a `TALKREQ` message.
* Response messages **MUST** be sent using the corresponding `TALKRESP` message.

- Request messages **MUST** be sent using a `TALKREQ` message.
- Response messages **MUST** be sent using the corresponding `TALKRESP` message.

All messages are encoded as an [SSZ Union](https://github.com/ethereum/consensus-specs/blob/dev/ssz/simple-serialize.md#union) type.

Expand All @@ -87,8 +92,8 @@ selector = 0x00
ping = Container(enr_seq: uint64, custom_payload: ByteList[2048])
```

* `enr_seq`: The node's current sequence number of their ENR record.
* `custom_payload`: Custom payload specified per the network.
- `enr_seq`: The node's current sequence number of their ENR record.
- `custom_payload`: Custom payload specified per the network.

#### Pong (0x01)

Expand All @@ -99,8 +104,8 @@ selector = 0x01
pong = Container(enr_seq: uint64, custom_payload: ByteList[2048])
```

* `enr_seq`: The node's current sequence number of their ENR record.
* `custom_payload`: Custom payload specified per the network.
- `enr_seq`: The node's current sequence number of their ENR record.
- `custom_payload`: Custom payload specified per the network.

#### Find Nodes (0x02)

Expand All @@ -111,9 +116,9 @@ selector = 0x02
find_nodes = Container(distances: List[uint16, max_length=256])
```

* `distances`: a list of distances for which the node is requesting ENR records.
* Each distance **MUST** be within the inclusive range `[0, 256]`
* Each distance in the list **MUST** be unique.
- `distances`: a list of distances for which the node is requesting ENR records.
- Each distance **MUST** be within the inclusive range `[0, 256]`
- Each distance in the list **MUST** be unique.
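
The constraints on `distances` could be checked on receipt roughly as follows. The function name `validate_distances` and the use of `ValueError` are choices made for this sketch, not part of the spec.

```python
MAX_DISTANCES = 256  # max_length of the `distances` list in the container


def validate_distances(distances):
    """Return `distances` unchanged if it satisfies the FindNodes rules."""
    if len(distances) > MAX_DISTANCES:
        raise ValueError("too many distances")
    if any(not 0 <= d <= 256 for d in distances):
        raise ValueError("distance outside the inclusive range [0, 256]")
    if len(set(distances)) != len(distances):
        raise ValueError("distances must be unique")
    return distances
```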

#### Nodes (0x03)

Expand All @@ -124,11 +129,10 @@ selector = 0x03
nodes = Container(total: uint8, enrs: List[ByteList[2048], max_length=32])
```

* `total`: The total number of `Nodes` response messages being sent. Currently fixed to only 1 response message.
* `enrs`: List of byte strings, each of which is an RLP encoded ENR record.
* Individual ENR records **MUST** correspond to one of the requested distances.
* It is invalid to return multiple ENR records for the same `node_id`.
* The ENR record of the requesting node **SHOULD** be filtered out of the list.
- `total`: The total number of `Nodes` response messages being sent. Currently fixed to only 1 response message.
- `enrs`: List of byte strings, each of which is an RLP encoded ENR record.
_ Individual ENR records **MUST** correspond to one of the requested distances.
_ It is invalid to return multiple ENR records for the same `node_id`. \* The ENR record of the requesting node **SHOULD** be filtered out of the list.
Comment on lines +132 to +135
Collaborator

This unordered list went wrong.

And to add on that, all the unordered list changes are not needed in the first place as asterisks, pluses, and hyphens may all be used for this in markdown.

The same applies for the emphasis asterisks vs underscores.


#### Find Content (0x04)

Expand All @@ -139,7 +143,7 @@ selector = 0x04
find_content = Container(content_key: ByteList[2048])
```

* `content_key`: The key for the content being requested. The encoding of `content_key` is specified per the network.
- `content_key`: The key for the content being requested. The encoding of `content_key` is specified per the network.

#### Content (0x05)

Expand All @@ -153,38 +157,41 @@ selector = 0x05
content = Union[connection_id: Bytes2, content: ByteList[2048], enrs: List[ByteList[2048], 32]]
```

* `connection_id`: Connection ID to set up a uTP stream to transmit the requested data.
* Connection ID values **SHOULD** be randomly generated.
* `content`: byte string of the requested content.
* This field **MUST** be used when the requested data can fit in this single response.
* `enrs`: List of byte strings, each of which is an RLP encoded ENR record.
* The list of ENR records **MUST** be closest nodes to the requested content that the responding node has stored.
* The set of derived `node_id` values from the ENR records **MUST** be unique.
* The ENR record of the requesting & responding node **SHOULD** be filtered out of the list.
- `connection_id`: Connection ID to set up a uTP stream to transmit the requested data.
- Connection ID values **SHOULD** be randomly generated.
- `content`: byte string of the requested content.
- This field **MUST** be used when the requested data can fit in this single response.
- `enrs`: List of byte strings, each of which is an RLP encoded ENR record.
- The list of ENR records **MUST** be closest nodes to the requested content that the responding node has stored.
- The set of derived `node_id` values from the ENR records **MUST** be unique.
- The ENR record of the requesting & responding node **SHOULD** be filtered out of the list.

If the node does not hold the requested content, and the node does not know of any nodes with eligible ENR values, then the node **MUST** return `enrs` as an empty list.
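
One way to read the rules above is as a three-way choice between the union variants. The sketch below returns a `(selector, value)` pair using the selectors from the `content` Union definition (`0x00` connection id, `0x01` content, `0x02` ENRs). The function name, the `MAX_INLINE` threshold taken from `ByteList[2048]`, and the tuple return type are assumptions; a real client would also have to respect packet size limits and SSZ-encode the result.

```python
import secrets

MAX_INLINE = 2048  # assumed threshold, taken from ByteList[2048]
MAX_ENRS = 32      # max_length of the `enrs` list


def build_content_response(content, closest_enrs):
    """Pick the union variant for a Content (0x05) response.

    `content` is the payload bytes, or None when the node does not hold it.
    `closest_enrs` is a list of encoded ENRs already filtered by the caller.
    """
    if content is not None:
        if len(content) <= MAX_INLINE:
            return (0x01, content)
        # Too large for one response: hand out a random uTP connection id.
        return (0x00, secrets.token_bytes(2))
    # Content not held: return known closer nodes, possibly an empty list.
    return (0x02, list(closest_enrs)[:MAX_ENRS])
```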

Upon *sending* this message with a `connection_id`, the sending node **SHOULD** *listen* for an incoming uTP stream with the generated `connection_id`.
Upon _sending_ this message with a `connection_id`, the sending node **SHOULD** _listen_ for an incoming uTP stream with the generated `connection_id`.

Upon *receiving* this message with a `connection_id`, the receiving node **SHOULD** *initiate* a uTP stream with the received `connection_id`.
Upon _receiving_ this message with a `connection_id`, the receiving node **SHOULD** _initiate_ a uTP stream with the received `connection_id`.
Comment on lines -167 to +173
Collaborator

unneeded change as mentioned above.

Member

The _ and - changes are required by the ethereum/EIPs linter. We should review and merge Piper's PR before this one though @kdeme

Collaborator

OK, fair enough. The same linter should then also be set up on this repo.

Author

Very good. I will be happy if I can help in this matter


##### `content` Union Definition

The `Union` defined in the `content` field of the `Content (0x05)` message is defined as below:

**`connection_id`**

```
selector = 0x00
ssz-type = Bytes2
```

**`content`**

```
selector = 0x01
ssz-type = ByteList[2048]
```

**`enrs`**

```
selector = 0x02
ssz-type = List[ByteList[2048], 32]
Expand All @@ -199,7 +206,7 @@ selector = 0x06
offer = Container(content_keys: List[ByteList[2048], max_length=64])
```

* `content_keys`: A list of encoded `content_key` entries. The encoding of each `content_key` is specified per the network.
- `content_keys`: A list of encoded `content_key` entries. The encoding of each `content_key` is specified per the network.

#### Accept (0x07)

Expand All @@ -212,14 +219,14 @@ selector = 0x07
accept = Container(connection_id: Bytes2, content_keys: BitList[max_length=64])
```

* `connection_id`: Connection ID to set up a uTP stream to transmit the requested data.
* ConnectionID values **SHOULD** be randomly generated.
* `content_keys`: Signals which content keys are desired.
* A bit-list corresponding to the offered keys with the bits in the positions of the desired keys set to `1`.
- `connection_id`: Connection ID to set up a uTP stream to transmit the requested data.
- ConnectionID values **SHOULD** be randomly generated.
- `content_keys`: Signals which content keys are desired.
- A bit-list corresponding to the offered keys with the bits in the positions of the desired keys set to `1`.
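
The `content_keys` bit-list can be pictured as one boolean per offered key, in offer order. The sketch below uses a plain Python list of booleans instead of an SSZ `BitList`; the helper names and the `wanted` set are hypothetical.

```python
MAX_OFFERED = 64  # max_length of content_keys in Offer and Accept


def build_accept_bits(offered_keys, wanted):
    """Return one bit per offered key, True where the key is desired."""
    if len(offered_keys) > MAX_OFFERED:
        raise ValueError("an Offer carries at most 64 content keys")
    return [key in wanted for key in offered_keys]


def accepted_keys(offered_keys, bits):
    """On the offering side, recover the keys whose bit is set."""
    if len(bits) != len(offered_keys):
        raise ValueError("bit-list length must match the number of offered keys")
    return [key for key, bit in zip(offered_keys, bits) if bit]
```

The offering node keeps its original key list so that it can map the returned bits back to keys.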

Upon *sending* this message, the requesting node **SHOULD** *listen* for an incoming uTP stream with the generated `connection_id`.
Upon _sending_ this message, the requesting node **SHOULD** _listen_ for an incoming uTP stream with the generated `connection_id`.

Upon *receiving* this message, the serving node **SHOULD** *initiate* a uTP stream with the received `connection_id`.
Upon _receiving_ this message, the serving node **SHOULD** _initiate_ a uTP stream with the received `connection_id`.
Comment on lines -220 to +229
Collaborator

unneeded change as mentioned above.


##### Content Encoding

Expand All @@ -234,6 +241,7 @@ The maximum size allowed for this application is limited to `uint32`.
The content item itself MUST be encoded as is defined for each specific network and content type.

The encoded data of n encoded content items to be sent over the stream can be formalized as:

```py
# n encoded content items to be sent over the stream, with n <= 64
encoded_content_list = [content_0, content_1, ..., content_n]
Expand Down Expand Up @@ -261,13 +269,11 @@ Similarly, we define a `logdistance` function identically to the Discovery v5 ne
logdistance(a: uint256, b: uint256) = log2(distance(a, b))
```
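
A literal reading of the formula above, as a sketch: it assumes `distance` is XOR (as in Discovery v5) and takes the floor of `log2`. Whether an implementation instead numbers distances by the bit length of the XOR value, which is one larger, is a convention this sketch does not settle.

```python
def distance(a: int, b: int) -> int:
    """XOR distance between two 256-bit values (assumed definition)."""
    return a ^ b


def logdistance(a: int, b: int) -> int:
    """floor(log2(distance(a, b))); undefined when a == b."""
    d = distance(a, b)
    if d == 0:
        raise ValueError("log2 of a zero distance is undefined")
    # For a positive integer, bit_length() - 1 equals floor(log2(d)).
    return d.bit_length() - 1
```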


### Test Vectors

A collection of test vectors for this specification can be found in the
[Portal wire test vectors](./portal-wire-test-vectors.md) document.


## Routing Table

Most networks that use the Portal Wire Protocol will form an independent DHT which requires individual nodes to maintain a routing table.
Expand Down Expand Up @@ -303,14 +309,12 @@ port := UDP port number

### Protocol Specific Node State

Sub protocols may define additional node state information which should be tracked in the node state database. This information will typically be transmitted in the `Ping.custom_payload` and `Pong.custom_payload` fields.

Sub protocols may define additional node state information which should be tracked in the node state database. This information will typically be transmitted in the `Ping.custom_payload` and `Pong.custom_payload` fields.

## Algorithms

Here we define a collection of generic algorithms which can be applied to a sub-protocol implementing the wire protocol.


### Lookup

The term lookup refers to the lookup algorithm described in section 2.3 of the Kademlia paper.
Expand Down Expand Up @@ -339,25 +343,26 @@ To find a piece of content for `content-id`, a node performs a content lookup vi

### Storing Content

The concept of content storage is only applicable to sub-protocols that implement persistant storage of data.
The concept of content storage is only applicable to sub-protocols that implement persistent storage of data.

Content will get stored by a node when:

- the node receives the content through the `Offer` - `Accept` message flow and the content falls within the node's radius
- the node requests content through the `FindContent` - `Content` message flow and the content falls within the node's radius
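
The "falls within the node's radius" condition in both cases can be sketched as a single comparison. XOR distance and integer ids are assumptions of this example, and the function name is hypothetical.

```python
def within_radius(node_id: int, content_id: int, radius: int) -> bool:
    """True when the content's distance from the node does not exceed its radius."""
    return (node_id ^ content_id) <= radius
```

A node would apply this check both to content it accepted via `Offer` and to content it fetched via `FindContent`.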

The network cannot make guarantees about the storage of particular content. A lazy node may ignore all `Offer` messages. A malicious node may send `Accept` messages and ignore the data transmissions. The `Offer` - `Accept` mechanism is in place to require that nodes explicitly accept some data before another node attempts to transmit that data. The mechanism prevents the unnecessary consumption of bandwidth in the presence of lazy nodes. However, it does not defend against malicious nodes who accept offers for data with no intent to store it.

### Neighborhood Gossip

We use the term *neighborhood gossip* to refer to the process through which content is disseminated to all of the DHT nodes *near* the location in the DHT where the content is located.
We use the term _neighborhood gossip_ to refer to the process through which content is disseminated to all of the DHT nodes _near_ the location in the DHT where the content is located.

The process works as follows:

- A DHT node is offered and receives a piece of content that it is interested in.
- This DHT node checks their routing table for `k` nearby DHT nodes that should also be interested in the content. Those `k` nodes **SHOULD** not include the node that originally provided aformentioned content.
- This DHT node checks their routing table for `k` nearby DHT nodes that should also be interested in the content. Those `k` nodes **SHOULD** not include the node that originally provided aforementioned content.
- If the DHT node finds `n` or more DHT nodes interested it selects `n` of these nodes and offers the content to them.
- If the DHT node finds less than `n` DHT nodes interested, it launches a node lookup with target `content-id` and it
offers the content to maximum `n` of the newly discovered nodes.
offers the content to maximum `n` of the newly discovered nodes.

The process above should quickly saturate the area of the DHT where the content is located and naturally terminate as more nodes become aware of the content.
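
The selection step of the process above could look roughly like this. The routing table is modelled as a dict from integer node id to advertised radius, `lookup` stands in for the node lookup algorithm, and picking the closest `n` candidates is a choice made for this sketch (the text only says "selects `n` of these nodes"). All names are hypothetical and XOR distance is assumed.

```python
def select_gossip_targets(content_id, routing_table, source, n, lookup):
    """Choose up to n nodes to offer freshly received content to.

    routing_table: dict mapping node_id (int) -> radius (int).
    source: node_id that provided the content; it is never selected.
    lookup: callable(content_id) -> iterable of node ids found by a node lookup.
    """
    interested = [
        node_id
        for node_id, radius in routing_table.items()
        if node_id != source and (node_id ^ content_id) <= radius
    ]
    interested.sort(key=lambda node_id: node_id ^ content_id)
    if len(interested) >= n:
        return interested[:n]
    # Too few interested nodes known locally: fall back to a node lookup.
    discovered = [node_id for node_id in lookup(content_id) if node_id != source]
    return discovered[:n]
```

The lookup is only triggered when the local routing table cannot supply `n` interested nodes.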
