Consider using the share version to differentiate compact and sparse shares #60

Open
rootulp opened this issue Feb 7, 2023 · 9 comments
Labels
enhancement New feature or request

Comments

@rootulp
Collaborator

rootulp commented Feb 7, 2023

Context

I wonder why sparse shares and compact shares both have the same version. To me, a different version indicates a different way of interpreting an array of 512 bytes.

@cmwaters

@musalbas
Member

musalbas commented Feb 7, 2023

Sparse shares and compact shares use the same universal share prefix as far as I know, therefore they should have the same version. The share version denotes the version of the universal share prefix, not the app-specific share prefix.

Similarly, a rollup would not get its own share versions just because it has its own share format after the universal share prefix.
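
To make the distinction concrete, here is a minimal Go sketch of reading only the universal share prefix. It is an illustration under stated assumptions, not the celestia-app implementation: the 29-byte namespace size and the info byte layout (share version in the upper 7 bits, a sequence start flag in the lowest bit) are assumptions, and `ParseUniversalPrefix` is a hypothetical helper. Only the 512-byte share size comes from this thread.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	// ShareSize is the 512-byte share size mentioned in this thread.
	ShareSize = 512
	// NamespaceSize is an assumed value used for illustration only.
	NamespaceSize = 29
)

// UniversalPrefix holds the fields that every share carries, regardless
// of how the application formats the bytes that follow.
type UniversalPrefix struct {
	Namespace     []byte
	ShareVersion  uint8 // assumed: upper 7 bits of the info byte
	SequenceStart bool  // assumed: lowest bit of the info byte
}

// ParseUniversalPrefix is a hypothetical helper that reads the prefix and
// deliberately does not look at the application-specific data after it.
func ParseUniversalPrefix(share []byte) (UniversalPrefix, error) {
	if len(share) != ShareSize {
		return UniversalPrefix{}, errors.New("share must be exactly 512 bytes")
	}
	info := share[NamespaceSize]
	return UniversalPrefix{
		Namespace:     share[:NamespaceSize],
		ShareVersion:  info >> 1,
		SequenceStart: info&1 == 1,
	}, nil
}

func main() {
	share := make([]byte, ShareSize)
	share[NamespaceSize] = 0b0000_0001 // version 0, first share of a sequence
	p, err := ParseUniversalPrefix(share)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.ShareVersion, p.SequenceStart) // prints "0 true"
}
```

Under this reading, a sparse share and a compact share both parse to version 0 here, because the parser stops before the application data begins.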

@rootulp rootulp changed the title Consider using the share version to different compact and sparse shares Consider using the share version to differentiate compact and sparse shares Feb 7, 2023
@rootulp
Collaborator Author

rootulp commented Feb 13, 2023

Sparse shares and compact shares use the same universal share prefix as far as I know, therefore they should have the same version

Correct, sparse shares and compact shares use the same universal share prefix, and they both currently use version 0.

@rootulp
Collaborator Author

rootulp commented Apr 14, 2023

FWIW an exploration of this proposal lives in https://github.com/cmwaters/shares/blob/main/shares.go
Note: this assumes the namespace version and share version are consolidated into one leading byte
cc: @cmwaters

@cmwaters
Collaborator

I think ideally when you're coming up with an encoding protocol, all information around encoding and decoding should be encapsulated in the version number (which should generally be in a fixed position, i.e. the first byte). Currently this is not the case: users need to parse the namespace to know whether something is compact or sparse. In addition, we may want to change these rules in the future: say we want PFBs to be sparse, or we allow users to decide whether they want their shares to be compact. Even the terms "compact" and "sparse" are defined relative to each other, but we might not remain in a two-mode world. What if we come up with a "supercompact" or "supersparse" share type, so that each version of the universal prefix could be one of three or four subversions?
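
A rough Go sketch of the contrast described above, for illustration only. `KindFromNamespace` models the current situation, where the parser has to inspect the namespace, and `KindFromVersion` models the proposal, where the leading byte alone selects the layout. The 29-byte namespace size, the rule that a reserved namespace is all zeros except its last byte, and the version numbers 0 and 1 are all assumptions made up for this example.

```go
package main

import "fmt"

// ShareKind names the two layouts discussed in this issue.
type ShareKind int

const (
	Sparse ShareKind = iota
	Compact
)

func (k ShareKind) String() string {
	if k == Compact {
		return "compact"
	}
	return "sparse"
}

// namespaceSize is an assumed value used for illustration only.
const namespaceSize = 29

// Hypothetical version numbers for the proposed scheme.
const (
	VersionSparse  uint8 = 0
	VersionCompact uint8 = 1
)

// KindFromNamespace models the current behaviour: the layout is implied by
// the namespace. Assumption: a namespace is reserved (and therefore compact)
// when every byte except the last one is zero.
func KindFromNamespace(share []byte) (ShareKind, error) {
	if len(share) < namespaceSize {
		return 0, fmt.Errorf("share too short: %d bytes", len(share))
	}
	for _, b := range share[:namespaceSize-1] {
		if b != 0 {
			return Sparse, nil
		}
	}
	return Compact, nil
}

// KindFromVersion models the proposal: the first byte alone picks the layout,
// so no knowledge of the namespace is needed.
func KindFromVersion(share []byte) (ShareKind, error) {
	if len(share) == 0 {
		return 0, fmt.Errorf("empty share")
	}
	switch share[0] {
	case VersionSparse:
		return Sparse, nil
	case VersionCompact:
		return Compact, nil
	default:
		return 0, fmt.Errorf("unknown share version %d", share[0])
	}
}

func main() {
	current := make([]byte, 512)
	current[namespaceSize-1] = 0x01 // reserved namespace under the assumption above
	k1, _ := KindFromNamespace(current)

	proposed := make([]byte, 512)
	proposed[0] = VersionCompact
	k2, _ := KindFromVersion(proposed)

	fmt.Println(k1, k2) // prints "compact compact"
}
```

A third or fourth layout would only need another `case` in `KindFromVersion`, whereas `KindFromNamespace` would need a new rule about which namespaces imply which layout.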

@musalbas
Member

musalbas commented Apr 23, 2023

Afaik, users can already make the non-reserved shares compact by themselves tho, by using the same encoding scheme that we use in the reserved namespace. In this sense, calling reserved namespace shares "compact" is kind of misleading, because the thing that differentiates them from other shares is that they are in the reserved namespace, not whether they are compact or sparse. So it doesn't seem to make sense to add a different version number for shares in reserved namespace vs non-reserved namespace.

@cmwaters
Collaborator

Afaik, users can already make the non-reserved shares compact by themselves tho, by using the same encoding scheme that we use in the reserved namespace

Users (block proposers) don't have any control over the encoding. It's enforced through consensus.

We could view it as being two different encoding schemes: one for reserved and one for the rest. I still don't quite like the idea that users need additional information to parse the shares (i.e. know the namespace it belongs to). It also might not be possible to rely on the namespace encoded in the share itself. Say we realise that reserved namespaces can be represented as a single byte (as all the rest are zeros) and thus we save 32 bytes per share. But now users will incorrectly parse the shares.

@musalbas
Member

musalbas commented Apr 24, 2023

Users (block proposers) don't have any control over the encoding. It's enforced through consensus.

Both reserved and non-reserved namespaces use the same universal share prefix. What is different is how data is encoded after the universal share prefix, which is considered to be application layer data. I don't see any reason why a message in a non-reserved namespace couldn't adopt the same encoding used after the universal share prefix. Note that the main difference between the reserved namespace and a non-reserved namespace is that in the reserved namespace, only one message is allowed (and it is created by the validators). But why can't a particular message in a non-reserved namespace adopt the same formatting, even if other messages don't?

The key design philosophy of reserved namespaces is that, from a parser's perspective, they should be treated like a rollup in any other namespace, and we shouldn't have special decoding logic for reserved vs non-reserved namespace shares. Ideally, celestia-app should be as conceptually close as possible to any other rollup on the chain.

@musalbas
Member

musalbas commented Apr 24, 2023

We could view it as being two different encoding schemes: one for reserved and one for the rest.

Both reserved and the rest have the same encoding scheme for the universal share prefix. However, every rollup needs to choose an encoding scheme for its actual app data, and that choice belongs inside the application logic. The reserved namespace chose a specific format for its app data, but that doesn't make the reserved namespace encoding special, and it doesn't need logic that differs from any other rollup's just because it's reserved: rollups also have to choose how they format their app data, since there are no "default" formatting rules for app data.

@cmwaters
Collaborator

cmwaters commented Apr 24, 2023

I understand where you're coming from. The terminology is a little confusing. There should be the universal share encoding protocol, and then, on top of that, the reserved namespaces have their own encoding system (which allows multiple messages to be within a share, a.k.a. multiple SDK messages are effectively in a single blob).
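
As a sketch of what "multiple messages within a share" could look like at the application layer, the Go example below packs several messages into one payload by prefixing each with a varint length. This is a generic length-delimited encoding used for illustration; whether it matches the reserved namespace format byte for byte is an assumption, and `PackUnits` and `UnpackUnits` are hypothetical names. Padding and splitting across shares are ignored.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// PackUnits concatenates messages, each preceded by its uvarint length.
func PackUnits(msgs [][]byte) []byte {
	var out []byte
	var lenBuf [binary.MaxVarintLen64]byte
	for _, m := range msgs {
		n := binary.PutUvarint(lenBuf[:], uint64(len(m)))
		out = append(out, lenBuf[:n]...)
		out = append(out, m...)
	}
	return out
}

// UnpackUnits reverses PackUnits. It assumes the payload holds whole
// messages only, with no trailing padding.
func UnpackUnits(payload []byte) ([][]byte, error) {
	var msgs [][]byte
	for len(payload) > 0 {
		size, n := binary.Uvarint(payload)
		if n <= 0 {
			return nil, errors.New("malformed length prefix")
		}
		payload = payload[n:]
		if size > uint64(len(payload)) {
			return nil, errors.New("truncated message")
		}
		msgs = append(msgs, payload[:size])
		payload = payload[size:]
	}
	return msgs, nil
}

func main() {
	packed := PackUnits([][]byte{[]byte("send"), []byte("delegate")})
	msgs, err := UnpackUnits(packed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes, %d messages\n", len(packed), len(msgs))
}
```

The same two functions would work unchanged for a rollup in a non-reserved namespace that chose this format for its own app data.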

@rootulp rootulp transferred this issue from celestiaorg/celestia-app May 24, 2024
@evan-forbes evan-forbes added enhancement New feature or request and removed needs:triage labels May 27, 2024