From 9bd0730a2aacbdbb0a96eb6e6bfe7d0eb2165a9b Mon Sep 17 00:00:00 2001 From: evan-forbes Date: Sun, 11 Jun 2023 20:55:06 -0500 Subject: [PATCH 1/5] docs: data square layout part 1 --- ...e_defaults.go => non_interaction_rules.go} | 0 ..._test.go => non_interaction_rules_test.go} | 0 specs/src/README.md | 3 +-- specs/src/SUMMARY.md | 3 +-- specs/src/rationale/index.md | 3 --- specs/src/specs/block_proposer.md | 4 +-- .../data_square_layout.md | 25 +++++++++---------- specs/src/specs/data_structures.md | 6 ++--- specs/src/specs/networking.md | 4 +-- specs/src/specs/shares.md | 1 + 10 files changed, 22 insertions(+), 27 deletions(-) rename pkg/shares/{non_interactive_defaults.go => non_interaction_rules.go} (100%) rename pkg/shares/{non_interactive_defaults_test.go => non_interaction_rules_test.go} (100%) delete mode 100644 specs/src/rationale/index.md rename specs/src/{rationale => specs}/data_square_layout.md (60%) create mode 100644 specs/src/specs/shares.md diff --git a/pkg/shares/non_interactive_defaults.go b/pkg/shares/non_interaction_rules.go similarity index 100% rename from pkg/shares/non_interactive_defaults.go rename to pkg/shares/non_interaction_rules.go diff --git a/pkg/shares/non_interactive_defaults_test.go b/pkg/shares/non_interaction_rules_test.go similarity index 100% rename from pkg/shares/non_interactive_defaults_test.go rename to pkg/shares/non_interaction_rules_test.go diff --git a/specs/src/README.md b/specs/src/README.md index f37c12b9b7..40dff4276e 100644 --- a/specs/src/README.md +++ b/specs/src/README.md @@ -7,8 +7,7 @@ - [Block Validity Rules](./specs/block_validity_rules.md) - [Networking](./specs/networking.md) - [Public-Key Cryptography](./specs/public_key_cryptography.md) -- [Rationale](./rationale/index.md) - - [Data Square Layout](./rationale/data_square_layout.md) + - [Data Square Layout](./specs/data_square_layout.md) - [State Machine Modules](./specs/state_machine_modules.md) - [blob](../../x/blob/README.md) - 
[qgb](../../x/qgb/README.md) diff --git a/specs/src/SUMMARY.md b/specs/src/SUMMARY.md index add086ba59..ec4c4bb861 100644 --- a/specs/src/SUMMARY.md +++ b/specs/src/SUMMARY.md @@ -9,8 +9,7 @@ - [Block Validity Rules](./specs/block_validity_rules.md) - [Networking](./specs/networking.md) - [Public-Key Cryptography](./specs/public_key_cryptography.md) -- [Rationale](./rationale/index.md) - - [Data Square Layout](./rationale/data_square_layout.md) + - [Data Square Layout](./specs/data_square_layout.md) - [State Machine Modules](./specs/state_machine_modules.md) - [blob](../../x/blob/README.md) - [qgb](../../x/qgb/README.md) diff --git a/specs/src/rationale/index.md b/specs/src/rationale/index.md deleted file mode 100644 index ac0b9f03a2..0000000000 --- a/specs/src/rationale/index.md +++ /dev/null @@ -1,3 +0,0 @@ -# Rationale - -- [Data Square Layout](./data_square_layout.md) diff --git a/specs/src/specs/block_proposer.md b/specs/src/specs/block_proposer.md index 41f6880eea..e53acec673 100644 --- a/specs/src/specs/block_proposer.md +++ b/specs/src/specs/block_proposer.md @@ -18,7 +18,7 @@ With these restrictions in mind, the block proposer performs the following actio 1. Collect as many transactions and blobs from the mempool as possible, such that the total number of shares is at most [`AVAILABLE_DATA_ORIGINAL_SQUARE_MAX`](./consensus.md#constants). 1. Compute the smallest square size that is a power of 2 that can fit the number of shares. 1. Attempt to lay out the collected transactions and blobs in the current square. - 1. If the square is too small to fit all transactions and blobs (which may happen [due to needing to insert padding between blobs](../rationale/data_square_layout.md)) and the square size is smaller than [`AVAILABLE_DATA_ORIGINAL_SQUARE_MAX`](./consensus.md#constants), double the size of the square and repeat the above step. - 1. 
If the square is too small to fit all transactions and blobs (which may happen [due to needing to insert padding between blobs](../rationale/data_square_layout.md)) and the square size is at [`AVAILABLE_DATA_ORIGINAL_SQUARE_MAX`](./consensus.md#constants), drop the transactions and blobs until the data fits within the square. + 1. If the square is too small to fit all transactions and blobs (which may happen [due to needing to insert padding between blobs](../specs/data_square_layout.md)) and the square size is smaller than [`AVAILABLE_DATA_ORIGINAL_SQUARE_MAX`](./consensus.md#constants), double the size of the square and repeat the above step. + 1. If the square is too small to fit all transactions and blobs (which may happen [due to needing to insert padding between blobs](../specs/data_square_layout.md)) and the square size is at [`AVAILABLE_DATA_ORIGINAL_SQUARE_MAX`](./consensus.md#constants), drop the transactions and blobs until the data fits within the square. Note: the maximum padding shares between blobs should be at most twice the number of blob shares. Doubling the square size (i.e. quadrupling the number of shares in the square) should thus only have to happen at most once. diff --git a/specs/src/rationale/data_square_layout.md b/specs/src/specs/data_square_layout.md similarity index 60% rename from specs/src/rationale/data_square_layout.md rename to specs/src/specs/data_square_layout.md index d87da4d08f..112f91ffce 100644 --- a/specs/src/rationale/data_square_layout.md +++ b/specs/src/specs/data_square_layout.md @@ -4,25 +4,24 @@ ## Preamble -Celestia uses [a data availability scheme](https://arxiv.org/abs/1809.09044) that allows nodes to determine whether a block's data was published without downloading the whole block. The core of this scheme is arranging data in a two-dimensional matrix then applying erasure coding to each row and column. 
This document describes the rationale for how data—transactions, blobs, and other data—[is actually arranged](../specs/data_structures.md#arranging-available-data-into-shares). Familiarity with the [originally proposed data layout format](https://arxiv.org/abs/1809.09044) is assumed. +Celestia uses [a data availability scheme](https://arxiv.org/abs/1809.09044) that allows nodes to determine whether a block's data was published without downloading the whole block. The core of this scheme is arranging data in a two-dimensional matrix then applying erasure coding to each row and column. This document describes the rationale for how data—transactions, blobs, and other data—[is actually arranged](./data_structures.md#arranging-available-data-into-shares). Familiarity with the [originally proposed data layout format](https://arxiv.org/abs/1809.09044) is assumed. -## Rationale +## Layout Rationale Block data consists of: -1. Cosmos SDK module transactions (e.g. [MsgSend](https://github.com/cosmos/cosmos-sdk/blob/f71df80e93bffbf7ce5fbd519c6154a2ee9f991b/proto/cosmos/bank/v1beta1/tx.proto#L21-L32)). These modify the Celestia chain's state. -1. Celestia-specific transactions (e.g. [PayForBlobs](../specs/data_structures.md#payforblobdata)). These modify the Celestia chain's state. -1. Intermediate state roots: required for fraud proofs of the aforementioned transactions. +1. Standard cosmos-SDK transactions: (which are often represented internally as the [`sdk.Tx` interface](https://github.com/celestiaorg/cosmos-sdk/blob/v1.14.0-sdk-v0.46.11/types/tx_msg.go#L42-L50)) as described in [cosmos-sdk ADR020](https://github.com/celestiaorg/cosmos-sdk/blob/v1.14.0-sdk-v0.46.11/docs/architecture/adr-020-protobuf-transaction-encoding.md) + 1. These transactions contain protobuf encoded [`sdk.Msg`](https://github.com/celestiaorg/cosmos-sdk/blob/v1.14.0-sdk-v0.46.11/types/tx_msg.go#L14-L26)s, which get executed atomically (if one fails they all fail) to update the Celestia state. 
The complete list of modules, which define the `sdk.Msg`s that the state machine is capable of handling, can be found in the [state machine modules spec](../specs/state_machine_modules.md). Examples include standard cosmos-sdk module messages such as [MsgSend](https://github.com/cosmos/cosmos-sdk/blob/f71df80e93bffbf7ce5fbd519c6154a2ee9f991b/proto/cosmos/bank/v1beta1/tx.proto#L21-L32)), and celestia specific module messages such as [`MsgPayForBlob`](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/proto/celestia/blob/v1/tx.proto#L16-L31) 1. Blobs: binary blobs which do not modify the Celestia state, but which are intended for a Celestia application identified with a provided namespace. -We want to arrange this data into a `k * k` matrix of fixed-sized shares, which will later be committed to in [Namespace Merkle Trees (NMTs)](../specs/data_structures.md#namespace-merkle-tree) so that individual shares in this matrix can be proven to belong to a single data root. +We want to arrange this data into a `k * k` matrix of fixed-sized [shares](../specs/shares.md), which will later be committed to in [Namespace Merkle Trees (NMTs)](https://github.com/celestiaorg/nmt/blob/v0.16.0/docs/spec/nmt.md) so that individual shares in this matrix can be proven to belong to a single data root. The simplest way we can imagine arranging block data is to simply serialize it all in no particular order, split it into fixed-sized shares, then arrange those shares into the `k * k` matrix in row-major order. However, this naive scheme can be improved in a number of ways, described below. First, we impose some ground rules: 1. Data must be ordered by namespace. This makes queries into a NMT commitment of that data more efficient. -1. Since non-blob data are not naturally intended for particular namespaces, we assign reserved namespaces for them. A range of namespaces is reserved for this purpose, starting from the lowest possible namespace. +1. 
Since non-blob data are not naturally intended for particular namespaces, we assign [reserved namespaces](./consensus.md#Reservered-Namespaces) for them. A range of namespaces is reserved for this purpose, starting from the lowest possible namespace. 1. By construction, the above two rules mean that non-blob data always precedes blob data in the row-major matrix, even when considering single rows or columns. 1. Data with different namespaces must not be in the same share. This might cause a small amount of wasted block space, but makes the NMT easier to reason about in general since leaves are guaranteed to belong to a single namespace. @@ -37,16 +36,16 @@ Specifically, blobs must begin at a new share. We note a nice property from this This, however, requires the block producer to interact with the transaction sender to provide them the starting location of their blob, so that the sender can sign over the commitment based on that starting location. This can be done selectively, but is not ideal as a default for e.g. end-user wallets. -### Non-Interactive Default Rules +### Non-Interaction Rules -As a non-consensus-critical default, we can impose one additional rule on blob placement to make the possible starting locations of blobs sufficiently predictable and constrained such that users can deterministically compute subtree roots without interaction: +We impose one additional rule on blob placement to make the possible starting locations of blobs sufficiently predictable and constrained such that users can deterministically compute subtree roots without the need to interact with the block proposer: -> Blobs start at an index that is a multiple of the blob minimum square size. The blob minimum square size is the smallest square that can contain the blob in isolation (i.e. a square with only this blob and no other transactions or blobs). +> Blobs must start at an index that is a multiple of the `SubtreeWidth`. 
The `SubtreeWidth` is the length of the blob in shares, divided by the [`SubtreeRootThreshold`](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/pkg/appconsts/v1/app_consts.go#L6) ([implementation here](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/pkg/shares/non_interactive_defaults.go#L94-L116)). -In the constraint mentioned above, the number of rows/columns in the minimum square size should be a power of 2. -With the above constraint, we can compute subtree roots deterministically. In order to compute the subtree roots, split the blob into chunks that are of maximum size: blob minimum square size. As an example, a blob of length `11` has a minimum square size of `4` because `11` is not greater than `4 * 4 = 16` total shares. Split the blob into chunks of length `4, 4, 2, 1` because each chunk must be a power of `2`. The resulting slices are the leaves of subtrees whose roots can be computed. These subtree roots will be present as internal nodes in the NMT of _some_ row(s). +The `SubtreeRootThreshold` is an arbitrary versioned protocol constant that aims to put a soft limit the number of subtree roots included in a blob inclusion proof, as described in [ADR013](../../../docs/architecture/adr-013-non-interactive-default-rules-for-zero-padding.md). -This is similar to [Merkle Mountain Ranges](https://www.usenix.org/legacy/event/sec09/tech/full_papers/crosby.pdf), though with the largest subtree bounded by the blob minimum square size rather than being unbounded. +In the constraint mentioned above, the number of rows/columns in the minimum square size should be a power of 2. +With the above constraint, we can compute subtree roots deterministically. For example, a blob of 128 shares and SRT = 64, must start on a share index that is a multiple of 2 because 128/64 = 2. In this case, there will be a maximum of 1 share of padding between blobs (more on padding below). 
The maximum subtree width in shares will also be 2, meaning that there will be 2 shares under each subtree root. The last piece of the puzzle is determining _which_ row the blob is placed at (or, more specifically, the starting location). This is needed to keep the block producer accountable. To this end, the block producer simply augments each fee-paying transaction with the starting locations of the blobs the transaction pays for. diff --git a/specs/src/specs/data_structures.md b/specs/src/specs/data_structures.md index cbee8b4138..e4e977fb36 100644 --- a/specs/src/specs/data_structures.md +++ b/specs/src/specs/data_structures.md @@ -406,7 +406,7 @@ For shares **with a namespace equal to [`PARITY_SHARE_NAMESPACE`](./consensus.md #### Namespace Padding Share -A namespace padding share acts as padding between blobs so that the subsequent blob may begin at an index that conforms to the [non-interactive default rules](../rationale/data_square_layout.md#non-interactive-default-rules). A namespace padding share contains the namespace ID of the blob that precedes it in the data square so that the data square can retain the property that all shares are ordered by namespace. +A namespace padding share acts as padding between blobs so that the subsequent blob may begin at an index that conforms to the [non-interactive default rules](../specs/data_square_layout.md#non-interaction-rules). A namespace padding share contains the namespace ID of the blob that precedes it in the data square so that the data square can retain the property that all shares are ordered by namespace. The first [`NAMESPACE_SIZE`](./consensus.md#constants) of a share's raw data `rawData` is the namespace of the blob that precedes this padding share. The next [`SHARE_INFO_BYTES`](./consensus.md#constants) bytes are for share information. The sequence start indicator is always `1`. The version bits are filled with the share version. The sequence length is zeroed out. 
The remaining [`SHARE_SIZE`](./consensus.md#constants)`-`[`NAMESPACE_SIZE`](./consensus.md#constants)`-`[`SHARE_INFO_BYTES`](./consensus.md#constants) `-` [`SEQUENCE_BYTES`](./consensus.md#constants) bytes are filled with `0`. @@ -457,7 +457,7 @@ For each blob, it is placed in the available data matrix, with row-major order, 1. Place the first share of the blob at the next unused location in the matrix, then place the remaining shares in the following locations. -Transactions [must commit to a Merkle root of a list of hashes](#transaction) that are each guaranteed (assuming the block is valid) to be subtree roots in one or more of the row NMTs. For additional info, see [the rationale document](../rationale/data_square_layout.md) for this section. +Transactions [must commit to a Merkle root of a list of hashes](#transaction) that are each guaranteed (assuming the block is valid) to be subtree roots in one or more of the row NMTs. For additional info, see [the rationale document](../specs/data_square_layout.md) for this section. However, with only the rule above, interaction between the block producer and transaction sender may be required to compute a commitment to the blob the transaction sender can sign over. To remove interaction, blobs can optionally be laid out using a non-interactive default: @@ -468,7 +468,7 @@ In the example below, two blobs (of lengths 2 and 1, respectively) are placed us ![fig: original data blob](./figures/rs2d_originaldata_blob.svg) -The non-interactive default rules may introduce empty shares that do not belong to any blob (in the example above, the top-right share is empty). These are zeroes with namespace ID equal to the either [`TAIL_TRANSACTION_PADDING_NAMESPACE_ID`](./consensus.md#constants) if between a request with a reserved namespace ID and a blob, or the namespace ID of the previous blob if succeeded by a blob. See the [rationale doc](../rationale/data_square_layout.md) for more info. 
+The non-interactive default rules may introduce empty shares that do not belong to any blob (in the example above, the top-right share is empty). These are zeroes with namespace ID equal to the either [`TAIL_TRANSACTION_PADDING_NAMESPACE_ID`](./consensus.md#constants) if between a request with a reserved namespace ID and a blob, or the namespace ID of the previous blob if succeeded by a blob. See the [rationale doc](../specs/data_square_layout.md) for more info. ## Available Data diff --git a/specs/src/specs/networking.md b/specs/src/specs/networking.md index 129f90e2ff..205c8f9566 100644 --- a/specs/src/specs/networking.md +++ b/specs/src/specs/networking.md @@ -59,9 +59,9 @@ Defined as `MsgWirePayForData`: Accepting a `MsgWirePayForData` into the mempool requires different logic than other transactions in Celestia, since it leverages the paradigm of block proposers being able to malleate transaction data. Unlike [SignedTransactionDataMsgPayForData](./data_structures.md#signedtransactiondatamsgpayfordata) (the canonical data type that is included in blocks and committed to with a data root in the block header), each `MsgWirePayForData` (the over-the-wire representation of the same) has potentially multiple signatures. -Transaction senders who want to pay for a blob will create a [SignedTransactionDataMsgPayForData](./data_structures.md#signedtransactiondatamsgpayfordata) object, `stx`, filling in the `stx.blobShareCommitment` field [based on the non-interactive default rules](../rationale/data_square_layout.md#non-interactive-default-rules), then signing it to get a [transaction](./data_structures.md#transaction) `tx`. 
+Transaction senders who want to pay for a blob will create a [SignedTransactionDataMsgPayForData](./data_structures.md#signedtransactiondatamsgpayfordata) object, `stx`, filling in the `stx.blobShareCommitment` field [based on the non-interaction rules](../specs/data_square_layout.md#non-interaction-rules), then signing it to get a [transaction](./data_structures.md#transaction) `tx`. -Receiving a `MsgWirePayForData` object from the network follows the reverse process: verify using the [non-interactive default rules](../rationale/data_square_layout.md#non-interactive-default-rules) that the signature is valid. +Receiving a `MsgWirePayForData` object from the network follows the reverse process: verify using the [non-interaction rules](../specs/data_square_layout.md#non-interaction-rules) that the signature is valid. ## Invalid Erasure Coding diff --git a/specs/src/specs/shares.md b/specs/src/specs/shares.md new file mode 100644 index 0000000000..806e046c43 --- /dev/null +++ b/specs/src/specs/shares.md @@ -0,0 +1 @@ +# Shares From 5fed294d871f85f0839540859c4cec56acc3dd1a Mon Sep 17 00:00:00 2001 From: evan-forbes Date: Mon, 12 Jun 2023 09:41:58 -0500 Subject: [PATCH 2/5] docs: add power of 2 note --- specs/src/specs/data_square_layout.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/specs/src/specs/data_square_layout.md b/specs/src/specs/data_square_layout.md index 112f91ffce..afa3b46914 100644 --- a/specs/src/specs/data_square_layout.md +++ b/specs/src/specs/data_square_layout.md @@ -40,7 +40,7 @@ This, however, requires the block producer to interact with the transaction send We impose one additional rule on blob placement to make the possible starting locations of blobs sufficiently predictable and constrained such that users can deterministically compute subtree roots without the need to interact with the block proposer: -> Blobs must start at an index that is a multiple of the `SubtreeWidth`. 
The `SubtreeWidth` is the length of the blob in shares, divided by the [`SubtreeRootThreshold`](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/pkg/appconsts/v1/app_consts.go#L6) ([implementation here](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/pkg/shares/non_interactive_defaults.go#L94-L116)). +> Blobs must start at an index that is a multiple of the `SubtreeWidth`. The `SubtreeWidth` is the length of the blob in shares, divided by the [`SubtreeRootThreshold`](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/pkg/appconsts/v1/app_consts.go#L6) and rounded up to the nearest power of 2 ([implementation here](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/pkg/shares/non_interactive_defaults.go#L94-L116)). The `SubtreeRootThreshold` is an arbitrary versioned protocol constant that aims to put a soft limit the number of subtree roots included in a blob inclusion proof, as described in [ADR013](../../../docs/architecture/adr-013-non-interactive-default-rules-for-zero-padding.md). From 33a1169005ee299c28ddc045e2f74294476ff8bd Mon Sep 17 00:00:00 2001 From: Callum Waters Date: Thu, 15 Jun 2023 10:18:11 +0200 Subject: [PATCH 3/5] Apply suggestions from code review Co-authored-by: Rootul P --- specs/src/specs/data_square_layout.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/specs/src/specs/data_square_layout.md b/specs/src/specs/data_square_layout.md index afa3b46914..942b0e2a89 100644 --- a/specs/src/specs/data_square_layout.md +++ b/specs/src/specs/data_square_layout.md @@ -11,8 +11,8 @@ Celestia uses [a data availability scheme](https://arxiv.org/abs/1809.09044) tha Block data consists of: 1. 
Standard cosmos-SDK transactions: (which are often represented internally as the [`sdk.Tx` interface](https://github.com/celestiaorg/cosmos-sdk/blob/v1.14.0-sdk-v0.46.11/types/tx_msg.go#L42-L50)) as described in [cosmos-sdk ADR020](https://github.com/celestiaorg/cosmos-sdk/blob/v1.14.0-sdk-v0.46.11/docs/architecture/adr-020-protobuf-transaction-encoding.md) - 1. These transactions contain protobuf encoded [`sdk.Msg`](https://github.com/celestiaorg/cosmos-sdk/blob/v1.14.0-sdk-v0.46.11/types/tx_msg.go#L14-L26)s, which get executed atomically (if one fails they all fail) to update the Celestia state. The complete list of modules, which define the `sdk.Msg`s that the state machine is capable of handling, can be found in the [state machine modules spec](../specs/state_machine_modules.md). Examples include standard cosmos-sdk module messages such as [MsgSend](https://github.com/cosmos/cosmos-sdk/blob/f71df80e93bffbf7ce5fbd519c6154a2ee9f991b/proto/cosmos/bank/v1beta1/tx.proto#L21-L32)), and celestia specific module messages such as [`MsgPayForBlob`](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/proto/celestia/blob/v1/tx.proto#L16-L31) -1. Blobs: binary blobs which do not modify the Celestia state, but which are intended for a Celestia application identified with a provided namespace. + 1. These transactions contain protobuf encoded [`sdk.Msg`](https://github.com/celestiaorg/cosmos-sdk/blob/v1.14.0-sdk-v0.46.11/types/tx_msg.go#L14-L26)s, which get executed atomically (if one fails they all fail) to update the Celestia state. The complete list of modules, which define the `sdk.Msg`s that the state machine is capable of handling, can be found in the [state machine modules spec](../specs/state_machine_modules.md). 
Examples include standard cosmos-sdk module messages such as [MsgSend](https://github.com/cosmos/cosmos-sdk/blob/f71df80e93bffbf7ce5fbd519c6154a2ee9f991b/proto/cosmos/bank/v1beta1/tx.proto#L21-L32)), and celestia specific module messages such as [`MsgPayForBlobs`](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/proto/celestia/blob/v1/tx.proto#L16-L31) +1. Blobs: binary large objects which do not modify the Celestia state, but which are intended for a Celestia application identified with a provided namespace. We want to arrange this data into a `k * k` matrix of fixed-sized [shares](../specs/shares.md), which will later be committed to in [Namespace Merkle Trees (NMTs)](https://github.com/celestiaorg/nmt/blob/v0.16.0/docs/spec/nmt.md) so that individual shares in this matrix can be proven to belong to a single data root. @@ -36,16 +36,16 @@ Specifically, blobs must begin at a new share. We note a nice property from this This, however, requires the block producer to interact with the transaction sender to provide them the starting location of their blob, so that the sender can sign over the commitment based on that starting location. This can be done selectively, but is not ideal as a default for e.g. end-user wallets. -### Non-Interaction Rules +### Blob Share Commitment Rules -We impose one additional rule on blob placement to make the possible starting locations of blobs sufficiently predictable and constrained such that users can deterministically compute subtree roots without the need to interact with the block proposer: +To make the possible starting locations of blobs sufficiently predictable and constrained such that users can deterministically compute subtree roots, needed for the `ShareCommitment`s within a PFB, without the need to interact with the block proposer, we impose one additional rule: > Blobs must start at an index that is a multiple of the `SubtreeWidth`. 
The `SubtreeWidth` is the length of the blob in shares, divided by the [`SubtreeRootThreshold`](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/pkg/appconsts/v1/app_consts.go#L6) and rounded up to the nearest power of 2 ([implementation here](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/pkg/shares/non_interactive_defaults.go#L94-L116)). -The `SubtreeRootThreshold` is an arbitrary versioned protocol constant that aims to put a soft limit the number of subtree roots included in a blob inclusion proof, as described in [ADR013](../../../docs/architecture/adr-013-non-interactive-default-rules-for-zero-padding.md). +The `SubtreeRootThreshold` is an arbitrary versioned protocol constant that aims to put a soft limit on the number of subtree roots included in a blob inclusion proof, as described in [ADR013](../../../docs/architecture/adr-013-non-interactive-default-rules-for-zero-padding.md). A higher `SubtreeRootThreshold` means less padding and more tightly packed squares but also means greater proof sizes. In the constraint mentioned above, the number of rows/columns in the minimum square size should be a power of 2. -With the above constraint, we can compute subtree roots deterministically. For example, a blob of 128 shares and SRT = 64, must start on a share index that is a multiple of 2 because 128/64 = 2. In this case, there will be a maximum of 1 share of padding between blobs (more on padding below). The maximum subtree width in shares will also be 2, meaning that there will be 2 shares under each subtree root. +With the above constraint, we can compute subtree roots deterministically. For example, a blob of 128 shares and `SubtreeRootThreshold` (SRT) = 64, must start on a share index that is a multiple of 2 because 128/64 = 2. In this case, there will be a maximum of 1 share of padding between blobs (more on padding below). The maximum subtree width in shares will also be 2, meaning that there will be 2 shares under each subtree root. 
The last piece of the puzzle is determining _which_ row the blob is placed at (or, more specifically, the starting location). This is needed to keep the block producer accountable. To this end, the block producer simply augments each fee-paying transaction with the starting locations of the blobs the transaction pays for. From bfa854b75d11c8ebcd51073e9f03c48798e5b758 Mon Sep 17 00:00:00 2001 From: evan-forbes Date: Thu, 15 Jun 2023 14:09:30 -0500 Subject: [PATCH 4/5] docs: remove old line --- specs/src/specs/data_square_layout.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/specs/src/specs/data_square_layout.md b/specs/src/specs/data_square_layout.md index 942b0e2a89..48d905927b 100644 --- a/specs/src/specs/data_square_layout.md +++ b/specs/src/specs/data_square_layout.md @@ -43,8 +43,6 @@ To make the possible starting locations of blobs sufficiently predictable and co > Blobs must start at an index that is a multiple of the `SubtreeWidth`. The `SubtreeWidth` is the length of the blob in shares, divided by the [`SubtreeRootThreshold`](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/pkg/appconsts/v1/app_consts.go#L6) and rounded up to the nearest power of 2 ([implementation here](https://github.com/celestiaorg/celestia-app/blob/v1.0.0-rc2/pkg/shares/non_interactive_defaults.go#L94-L116)). The `SubtreeRootThreshold` is an arbitrary versioned protocol constant that aims to put a soft limit on the number of subtree roots included in a blob inclusion proof, as described in [ADR013](../../../docs/architecture/adr-013-non-interactive-default-rules-for-zero-padding.md). A higher `SubtreeRootThreshold` means less padding and more tightly packed squares but also means greater proof sizes. - -In the constraint mentioned above, the number of rows/columns in the minimum square size should be a power of 2. With the above constraint, we can compute subtree roots deterministically. 
For example, a blob of 128 shares and `SubtreeRootThreshold` (SRT) = 64, must start on a share index that is a multiple of 2 because 128/64 = 2. In this case, there will be a maximum of 1 share of padding between blobs (more on padding below). The maximum subtree width in shares will also be 2, meaning that there will be 2 shares under each subtree root. The last piece of the puzzle is determining _which_ row the blob is placed at (or, more specifically, the starting location). This is needed to keep the block producer accountable. To this end, the block producer simply augments each fee-paying transaction with the starting locations of the blobs the transaction pays for. From 2a24af891702c42bb3c2d7f15076ba57f0e55e46 Mon Sep 17 00:00:00 2001 From: evan-forbes Date: Thu, 15 Jun 2023 14:33:45 -0500 Subject: [PATCH 5/5] docs: replace non-interactive/non-interaction --- ...raction_rules.go => blob_share_commitment_rules.go} | 10 +++++----- ...les_test.go => blob_share_commitment_rules_test.go} | 2 +- pkg/shares/padding.go | 2 +- specs/src/specs/data_structures.md | 6 +++--- specs/src/specs/networking.md | 4 ++-- x/blob/README.md | 4 ++-- x/blob/types/payforblob.go | 6 +++--- 7 files changed, 17 insertions(+), 17 deletions(-) rename pkg/shares/{non_interaction_rules.go => blob_share_commitment_rules.go} (89%) rename pkg/shares/{non_interaction_rules_test.go => blob_share_commitment_rules_test.go} (99%) diff --git a/pkg/shares/non_interaction_rules.go b/pkg/shares/blob_share_commitment_rules.go similarity index 89% rename from pkg/shares/non_interaction_rules.go rename to pkg/shares/blob_share_commitment_rules.go index ba1c9eddd5..22ed0e05f6 100644 --- a/pkg/shares/non_interaction_rules.go +++ b/pkg/shares/blob_share_commitment_rules.go @@ -9,9 +9,9 @@ import ( // FitsInSquare uses the non interactive default rules to see if blobs of // some lengths will fit in a square of squareSize starting at share index // cursor. 
Returns whether the blobs fit in the square and the number of -// shares used by blobs. See non-interactive default rules -// https://github.com/celestiaorg/celestia-specs/blob/master/src/rationale/message_block_layout.md#non-interactive-default-rules -// https://github.com/celestiaorg/celestia-app/blob/1b80b94a62c8c292f569e2fc576e26299985681a/docs/architecture/adr-009-non-interactive-default-rules-for-reduced-padding.md +// shares used by blobs. See blob share commitment rules +// ../../specs/src/specs/data_square_layout.md#blob-share-commitment-rules +// ../../docs/architecture/adr-013-non-interactive-default-rules-for-zero-padding.md func FitsInSquare(cursor, squareSize, subtreeRootThreshold int, blobShareLens ...int) (bool, int) { if len(blobShareLens) == 0 { if cursor <= squareSize*squareSize { @@ -30,7 +30,7 @@ func FitsInSquare(cursor, squareSize, subtreeRootThreshold int, blobShareLens .. } // BlobSharesUsedNonInteractiveDefaults returns the number of shares used by a given set -// of blobs share lengths. It follows the non-interactive default rules and +// of blobs share lengths. It follows the blob share commitment rules and // returns the share indexes for each blob. func BlobSharesUsedNonInteractiveDefaults(cursor, squareSize, subtreeRootThreshold int, blobShareLens ...int) (sharesUsed int, indexes []uint32) { start := cursor @@ -44,7 +44,7 @@ func BlobSharesUsedNonInteractiveDefaults(cursor, squareSize, subtreeRootThresho } // NextShareIndex determines the next index in a square that can be used. It -// follows the non-interactive default rules defined in ADR013. Assumes +// follows the blob share commitment rules defined in ADR013. Assumes // that all args are non negative, and that squareSize is a power of two.
// https://github.com/celestiaorg/celestia-specs/blob/master/src/rationale/message_block_layout.md#non-interactive-default-rules // https://github.com/celestiaorg/celestia-app/blob/0334749a9e9b989fa0a42b7f011f4a79af8f61aa/docs/architecture/adr-013-non-interactive-default-rules-for-zero-padding.md diff --git a/pkg/shares/non_interaction_rules_test.go b/pkg/shares/blob_share_commitment_rules_test.go similarity index 99% rename from pkg/shares/non_interaction_rules_test.go rename to pkg/shares/blob_share_commitment_rules_test.go index eb8195f693..d675dfb6b2 100644 --- a/pkg/shares/non_interaction_rules_test.go +++ b/pkg/shares/blob_share_commitment_rules_test.go @@ -222,7 +222,7 @@ func TestNextShareIndex(t *testing.T) { expectedIndex: 11, }, { - name: "non-interactive default rules for reduced padding diagram", + name: "blob share commitment rules for reduced padding diagram", cursor: 11, blobLen: 11, squareSize: 8, diff --git a/pkg/shares/padding.go b/pkg/shares/padding.go index 32d3943cf2..d2ee22bb73 100644 --- a/pkg/shares/padding.go +++ b/pkg/shares/padding.go @@ -10,7 +10,7 @@ import ( // NamespacePaddingShare returns a share that acts as padding. Namespace padding // shares follow a blob so that the next blob may start at an index that -// conforms to non-interactive default rules. The ns parameter provided should +// conforms to blob share commitment rules. The ns parameter provided should // be the namespace of the blob that precedes this padding in the data square. 
func NamespacePaddingShare(ns appns.Namespace) (Share, error) { b, err := NewBuilder(ns, appconsts.ShareVersionZero, true).Init() diff --git a/specs/src/specs/data_structures.md b/specs/src/specs/data_structures.md index e4e977fb36..104cb7a3fa 100644 --- a/specs/src/specs/data_structures.md +++ b/specs/src/specs/data_structures.md @@ -406,13 +406,13 @@ For shares **with a namespace equal to [`PARITY_SHARE_NAMESPACE`](./consensus.md #### Namespace Padding Share -A namespace padding share acts as padding between blobs so that the subsequent blob may begin at an index that conforms to the [non-interactive default rules](../specs/data_square_layout.md#non-interaction-rules). A namespace padding share contains the namespace ID of the blob that precedes it in the data square so that the data square can retain the property that all shares are ordered by namespace. +A namespace padding share acts as padding between blobs so that the subsequent blob may begin at an index that conforms to the [blob share commitment rules](../specs/data_square_layout.md#blob-share-commitment-rules). A namespace padding share contains the namespace ID of the blob that precedes it in the data square so that the data square can retain the property that all shares are ordered by namespace. The first [`NAMESPACE_SIZE`](./consensus.md#constants) of a share's raw data `rawData` is the namespace of the blob that precedes this padding share. The next [`SHARE_INFO_BYTES`](./consensus.md#constants) bytes are for share information. The sequence start indicator is always `1`. The version bits are filled with the share version. The sequence length is zeroed out. The remaining [`SHARE_SIZE`](./consensus.md#constants)`-`[`NAMESPACE_SIZE`](./consensus.md#constants)`-`[`SHARE_INFO_BYTES`](./consensus.md#constants) `-` [`SEQUENCE_BYTES`](./consensus.md#constants) bytes are filled with `0`. 
#### Reserved Padding Share -Reserved padding shares are placed after the last reserved namespace share in the data square so that the first blob can start at an index that conforms to non-interactive default rules. Clients can safely ignore the contents of these shares because they don't contain any significant data. +Reserved padding shares are placed after the last reserved namespace share in the data square so that the first blob can start at an index that conforms to blob share commitment rules. Clients can safely ignore the contents of these shares because they don't contain any significant data. For shares **with a namespace ID equal to [`RESERVED_PADDING_NAMESPACE`](./consensus.md#constants)** (i.e. reserved padding shares): @@ -468,7 +468,7 @@ In the example below, two blobs (of lengths 2 and 1, respectively) are placed us ![fig: original data blob](./figures/rs2d_originaldata_blob.svg) -The non-interactive default rules may introduce empty shares that do not belong to any blob (in the example above, the top-right share is empty). These are zeroes with namespace ID equal to the either [`TAIL_TRANSACTION_PADDING_NAMESPACE_ID`](./consensus.md#constants) if between a request with a reserved namespace ID and a blob, or the namespace ID of the previous blob if succeeded by a blob. See the [rationale doc](../specs/data_square_layout.md) for more info. +The blob share commitment rules may introduce empty shares that do not belong to any blob (in the example above, the top-right share is empty). These are zeroes with namespace ID equal to either [`TAIL_TRANSACTION_PADDING_NAMESPACE_ID`](./consensus.md#constants) if between a request with a reserved namespace ID and a blob, or the namespace ID of the previous blob if succeeded by a blob. See the [rationale doc](../specs/data_square_layout.md) for more info.
## Available Data diff --git a/specs/src/specs/networking.md b/specs/src/specs/networking.md index 205c8f9566..4a3fb0698e 100644 --- a/specs/src/specs/networking.md +++ b/specs/src/specs/networking.md @@ -59,9 +59,9 @@ Defined as `MsgWirePayForData`: Accepting a `MsgWirePayForData` into the mempool requires different logic than other transactions in Celestia, since it leverages the paradigm of block proposers being able to malleate transaction data. Unlike [SignedTransactionDataMsgPayForData](./data_structures.md#signedtransactiondatamsgpayfordata) (the canonical data type that is included in blocks and committed to with a data root in the block header), each `MsgWirePayForData` (the over-the-wire representation of the same) has potentially multiple signatures. -Transaction senders who want to pay for a blob will create a [SignedTransactionDataMsgPayForData](./data_structures.md#signedtransactiondatamsgpayfordata) object, `stx`, filling in the `stx.blobShareCommitment` field [based on the non-interaction rules](../specs/data_square_layout.md#non-interaction-rules), then signing it to get a [transaction](./data_structures.md#transaction) `tx`. +Transaction senders who want to pay for a blob will create a [SignedTransactionDataMsgPayForData](./data_structures.md#signedtransactiondatamsgpayfordata) object, `stx`, filling in the `stx.blobShareCommitment` field [based on the blob share commitment rules](../specs/data_square_layout.md#blob-share-commitment-rules), then signing it to get a [transaction](./data_structures.md#transaction) `tx`. -Receiving a `MsgWirePayForData` object from the network follows the reverse process: verify using the [non-interaction rules](../specs/data_square_layout.md#non-interaction-rules) that the signature is valid. +Receiving a `MsgWirePayForData` object from the network follows the reverse process: verify using the [blob share commitment rules](../specs/data_square_layout.md#blob-share-commitment-rules) that the signature is valid.
## Invalid Erasure Coding diff --git a/x/blob/README.md b/x/blob/README.md index 08c9f4bae7..e34978c035 100644 --- a/x/blob/README.md +++ b/x/blob/README.md @@ -13,7 +13,7 @@ The `x/blob` module enables users to pay for arbitrary data to be published to t 1. `NamespaceIds []byte`: the namespaces they wish to publish each blob to. The namespaces here must match the namespaces in the `Blob`s. 1. `ShareCommitment []byte`: a share commitment that is the root of a Merkle tree where the leaves are share commitments to each blob associated with this BlobTx. -After the `BlobTx` is submitted to the network, a block producer separates the transaction from the blob. Both components get included in the data square in different namespaces: the BlobTx gets included in the PayForBlobNamespace and the associated blob gets included in the namespace the user specified in the original `BlobTx`. Further reading: [Message Block Layout](https://github.com/celestiaorg/celestia-specs/blob/master/src/rationale/message_block_layout.md) +After the `BlobTx` is submitted to the network, a block producer separates the transaction from the blob. Both components get included in the data square in different namespaces: the BlobTx gets included in the PayForBlobNamespace and the associated blob gets included in the namespace the user specified in the original `BlobTx`. Further reading: [Data Square Layout](../../specs/src/specs/data_square_layout.md) After a block has been created, the user can verify that their data was included in a block via a blob inclusion proof. A blob inclusion proof uses the `ShareCommitment` in the original transaction and subtree roots of the block's data square to prove to the user that the shares that compose their original data do in fact exist in a particular block. @@ -29,7 +29,7 @@ When a `MsgPayForBlob` is processed, it consumes gas based on the blob size. 
## PrepareProposal -When a block producer is preparing a block, they must perform an extra step for `BlobTx`s so that end-users can find the blob shares relevant to their submitted `BlobTx`. In particular, block proposers wrap the `BlobTx` in the PayForBlobs namespace with the index of the first share of the blob in the data square. See [Non-interactive Default Rules](https://github.com/celestiaorg/celestia-specs/blob/master/src/rationale/message_block_layout.md#non-interactive-default-rules) for more details. +When a block producer is preparing a block, they must perform an extra step for `BlobTx`s so that end-users can find the blob shares relevant to their submitted `BlobTx`. In particular, block proposers wrap the `BlobTx` in the PayForBlobs namespace with the index of the first share of the blob in the data square. See [Blob share commitment rules](../../specs/src/specs/data_square_layout.md#blob-share-commitment-rules) for more details. Since `BlobTx`s can contain multiple blobs, the `BlobTx` is wrapped with one share index per blob in the transaction. The index wrapped transaction is called an [IndexWrapper](https://github.com/celestiaorg/celestia-core/blob/2d2a65f59eabf1993804168414b86d758f30c383/proto/tendermint/types/types.proto#L192-L198) and this is the type that gets marshalled and written to the PayForBlobNamespace. diff --git a/x/blob/types/payforblob.go b/x/blob/types/payforblob.go index 601eb310c8..24c51d8f41 100644 --- a/x/blob/types/payforblob.go +++ b/x/blob/types/payforblob.go @@ -167,10 +167,10 @@ func (msg *MsgPayForBlobs) GetSigners() []sdk.AccAddress { } // CreateCommitment generates the share commitment for a given blob. -// See [Message layout rationale] and [Non-interactive default rules]. +// See [data square layout rationale] and [blob share commitment rules]. 
// -// [Message layout rationale]: https://github.com/celestiaorg/celestia-specs/blob/e59efd63a2165866584833e91e1cb8a6ed8c8203/src/rationale/message_block_layout.md?plain=1#L12 -// [Non-interactive default rules]: https://github.com/celestiaorg/celestia-specs/blob/e59efd63a2165866584833e91e1cb8a6ed8c8203/src/rationale/message_block_layout.md?plain=1#L36 +// [data square layout rationale]: ../../specs/src/specs/data_square_layout.md +// [blob share commitment rules]: ../../specs/src/specs/data_square_layout.md#blob-share-commitment-rules func CreateCommitment(blob *Blob) ([]byte, error) { coreblob := coretypes.Blob{ NamespaceID: blob.NamespaceId,