Add Quorum Key Resharding Service #446
base: main
phew this was a huge chunk of work - nice job dude! all my comments are minor/nits.
1) Generate the configuration for how to reshard the given quorum keys. This configuration is called the `ReshardInput`.
2) Boot the enclave in reshard mode using the `ReshardInput`.
3) A threshold of the _old_ share holders query the enclave for an attestation document.
4) A threshold of the _old_ share holders re-encrypt their shares to the enclaves ephemeral key and post those shares in a single message. The data structure used to group a users shares and their corresponding quorum keys is called the `ReshardProvisionInput`.
"to group a user's shares and their corresponding quorum keys" (N encrypted user shares + N quorum keys in each `ReshardProvisionInput`, right?)
If N = quorum key count, then a single share holder will submit N shares (one for each quorum key)
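To make the "one share per quorum key" point concrete, here is a minimal sketch of the check an enclave could run on a posted provision input. The type names, field names, and error variants below are hypothetical stand-ins, not the PR's actual `ReshardProvisionInput` definition.

```rust
use std::collections::HashSet;

/// Hypothetical pairing of one quorum key with the holder's share of it.
#[derive(Debug, Clone, PartialEq)]
pub struct ProvisionEntry {
    /// Public key of the quorum key this share belongs to.
    pub quorum_key: Vec<u8>,
    /// The holder's share, re-encrypted to the enclave's ephemeral key.
    pub encrypted_share: Vec<u8>,
}

/// Hypothetical stand-in for a single share holder's provision input.
#[derive(Debug, Clone, PartialEq)]
pub struct ProvisionInput {
    pub alias: String,
    pub entries: Vec<ProvisionEntry>,
}

#[derive(Debug, PartialEq)]
pub enum ProvisionError {
    WrongShareCount { expected: usize, got: usize },
    UnknownQuorumKey,
    DuplicateQuorumKey,
}

/// Checks that `input` carries exactly one share for each expected quorum key.
pub fn check_provision_input(
    expected_keys: &[Vec<u8>],
    input: &ProvisionInput,
) -> Result<(), ProvisionError> {
    if input.entries.len() != expected_keys.len() {
        return Err(ProvisionError::WrongShareCount {
            expected: expected_keys.len(),
            got: input.entries.len(),
        });
    }
    let mut seen: HashSet<&[u8]> = HashSet::new();
    for entry in &input.entries {
        // Every share must reference a quorum key named in the reshard input.
        if !expected_keys.iter().any(|k| k == &entry.quorum_key) {
            return Err(ProvisionError::UnknownQuorumKey);
        }
        // `insert` returns false when the key was already present.
        if !seen.insert(entry.quorum_key.as_slice()) {
            return Err(ProvisionError::DuplicateQuorumKey);
        }
    }
    Ok(())
}
```

With N quorum keys, a holder who posts fewer than N entries, or two entries for the same key, is rejected before any decryption is attempted.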
qos_client reshard-re-encrypt-share \
--yubikey \
--quorum-share-dir-multiple <read: path to directory to specify> \
--attestation-doc-path <read: path to attestation doc> \
--provision-input-path <write: path to the file to write this users provision input> \
--reshard-input-path <read: path to reshard input> \
--qos-release-dir <read: path to a dir with pcrs file and release.env> \
--pcr3-preimage-path <read: path to the IAM role for the enclave> \
--new-share-set-dir <read: path to dir new share set> \
--old-share-set-dir <read: path to dir with old_share_set> \
--alias <alias for share holder>
I don't really understand why this part has to be different from a normal share posting step where quorum key shares are posted to the enclave 🤔
I guess the nature of resharding is compatible, but because we've chosen to implement "bulk" resharding, it's not compatible with the existing interface. That's a tradeoff: we could reuse the normal share posting logic at the expense of doing resharding 4 times (once per quorum key)
What I'm thinking is we could have most of the resharding logic out of QOS and inside of a normal enclave app (`reshard`):
- boot the reshard app with an existing quorum key to reshard + pivot args which specify the new share set
- the app's only job (once qos pivots) is to reshard the quorum key according to the pivot args
- there's then an interface for new share set members to download attestations and new encrypted shares.
Rinse and repeat for each quorum key to reshard (you'd need to boot 4 reshard enclaves with 4 different pivot args to reshard the 4 quorum keys we currently have)
Obviously that's a big departure from what you have here so I don't think it's worth implementing; just thinking out loud.
We talked about building an app, and in fact we could design such an app so that you bulk post shares. The real determining factor was just that it was a bit less code to put this directly in qos (didn't have to make a separate host and then also a client for talking to that host). Other justifications are that:
- this is super sensitive, and we want the enclave to shut down if anything goes wrong during the flow
- having the configuration as pivot args is hard to read and kind of non-ergonomic
- we expect anyone using qos will want this base operation, so worth batching it in
- the sharding logic is native to qos, so it's clean to do the resharding in the same place
At the end of the day these are just tradeoffs - I don't feel strongly that there is a right or wrong decision
/// Waiting for quorum key shards to be posted so the reshard service can
/// be executed.
ReshardWaitingForQuorumShards,
/// Reshard service has completed
ReshardBooted,
/// Reshard failed to reconstruct the quorum key
UnrecoverableReshardFailedBadShares,
It seems a bit odd to make these new phases part of the "protocol". They represent an entirely separate state machine IMO: we don't expect a resharding enclave to reach "QuorumKeyProvisioned" for example.
(at the same time, I understand that it's easiest to put this here instead of having a separate `ReshardingProtocolPhase` enum)
I wanted to keep as one enum to cohesively show every possible state, even if the state transitions map is more like a DAG
I wanted the states downstream of a boot reshard to be distinct, such that the state machine would only allow reshard functionality, in order to limit surface area
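As a rough illustration of the "one enum, separate branch" idea discussed in this thread, the sketch below models only the reshard edges. The three reshard phase names and `QuorumKeyProvisioned` come from the PR discussion; `WaitingForBootInstruction` as the entry phase and the exact set of allowed edges are assumptions made for this example, and the standard boot path is deliberately left out.

```rust
/// Simplified phase enum. Only the variants needed for the illustration
/// are listed; the real `ProtocolPhase` has more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    /// Assumed entry phase before any boot instruction is received.
    WaitingForBootInstruction,
    /// Standard-boot phase; included to show it is unreachable from reshard.
    QuorumKeyProvisioned,
    ReshardWaitingForQuorumShards,
    ReshardBooted,
    UnrecoverableReshardFailedBadShares,
}

/// Returns true if `phase` belongs to the reshard branch of the state machine.
pub fn is_reshard_phase(phase: Phase) -> bool {
    matches!(
        phase,
        Phase::ReshardWaitingForQuorumShards
            | Phase::ReshardBooted
            | Phase::UnrecoverableReshardFailedBadShares
    )
}

/// Hypothetical transition table covering the reshard branch only.
pub fn can_transition(from: Phase, to: Phase) -> bool {
    use Phase::*;
    matches!(
        (from, to),
        (WaitingForBootInstruction, ReshardWaitingForQuorumShards)
            | (ReshardWaitingForQuorumShards, ReshardBooted)
            | (ReshardWaitingForQuorumShards, UnrecoverableReshardFailedBadShares)
    )
}
```

Because no edge leaves the reshard branch, an enclave booted in reshard mode can never reach `QuorumKeyProvisioned` in this model, which is the surface-area argument made above.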
pub struct ReshardInput {
/// List of quorum public keys
pub quorum_keys: Vec<Vec<u8>>,
/// The set and threshold to shard the key.
since we have multiple quorum keys: "to shard the quorum keys"
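For readers unfamiliar with the shape of this configuration, a minimal sketch of a reshard input and some sanity checks follows. Only the `quorum_keys` field appears in the quoted diff; the `new_share_set` field, the `ShareSetSketch` type, and the validation rules are assumptions for illustration, not the PR's actual code.

```rust
/// Hypothetical share set: who receives new shares and how many are needed
/// to reconstruct a key.
#[derive(Debug, Clone, PartialEq)]
pub struct ShareSetSketch {
    pub threshold: u32,
    /// Member aliases; a real share set would also carry public keys.
    pub members: Vec<String>,
}

/// Hypothetical stand-in for `ReshardInput`.
#[derive(Debug, Clone, PartialEq)]
pub struct ReshardInputSketch {
    /// List of quorum public keys to reshard.
    pub quorum_keys: Vec<Vec<u8>>,
    /// The set and threshold used to shard every quorum key.
    pub new_share_set: ShareSetSketch,
}

/// Basic sanity checks one might run before booting in reshard mode.
pub fn validate(input: &ReshardInputSketch) -> Result<(), String> {
    if input.quorum_keys.is_empty() {
        return Err("no quorum keys to reshard".to_string());
    }
    let set = &input.new_share_set;
    if set.threshold == 0 {
        return Err("threshold must be at least 1".to_string());
    }
    // A threshold larger than the member count could never be met.
    if set.threshold as usize > set.members.len() {
        return Err("threshold exceeds number of members".to_string());
    }
    Ok(())
}
```

The same share set applies to every key in `quorum_keys`, which is why the doc comment reads better in the plural.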
wip wip get to compile Initial reshard provision and unit test code clean up fix n choose k refactor n choose k Finish tests for reshard add test for boot reshard Add generate reshard input wip wip wip wip Get reshard-renencrypt working Get post share working lint wip Build get reshard output Add new secrets for file key get full thing working e2e refactor wip get qos core to compile Get all of qos core working refactor to not use quorumpubkey wrapper wip Get e2e tests working with new input Add logic for checking that e2e share recombination works finish integration test lint stuff Improve human verifiactions clean up
### Added

- BREAKING CHANGE: qos_core: quorum key resharding service, new state machine transitions, and new `ProtocolMsg` variants (#428)
Wanted to highlight this breaking change - @r-n-o lmk if you want to merge your qos_net pr first so you can update qos in mono without worrying about breaking changes
Sounds good to me! If you don't mind having my PR go first I'll take you up on this offer!
I think this PR needs a rebase on top of the current @emostov @r-n-o has any of our base goals with this PR changed since this was last worked on in June?
TODO:
Summary & Motivation (Problem vs. Solution)
QOS users need a way to reshard quorum keys.
This PR solves the issue by adding a new boot mode specifically for resharding keys. The process is similar to running a boot standard, in that members from the share set need to post shares to reconstruct the quorum keys. Once quorum keys are reconstructed, they are resharded to the new share set. New share set members can then pull their shares and verify that they can decrypt them.
Prior to reviewing the code in detail, take a look at `src/qos_client/RESHARD_GUIDE.md`, which describes how to execute quorum key resharding.
How I Tested These Changes
related: https://github.com/tkhq/gitops/pull/1133