Implement ACP-118 Aggregator #3394
base: master
Conversation
Signed-off-by: Joshua Kim <[email protected]>
```go
// NewClientWithPeers generates a client to communicate to a set of peers
func NewClientWithPeers(
```
I can make this a separate PR if requested, but I've gotten feedback in the past that PRs containing only test utilities can be hard to understand without corresponding usage.
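For context on the shape of the idea, here is a heavily simplified sketch of a client routed through an in-memory peer map. It uses string node IDs and a bare handler func type instead of the real `ids.NodeID`/`p2p.Handler`, so it is illustrative only, not the PR's implementation:

```go
package main

import "fmt"

// Handler is a stand-in for a peer's request handler (the real utility wires
// p2p.Handler implementations into in-memory networks).
type Handler func(request []byte) ([]byte, error)

// Client routes requests to in-memory peers instead of a real network.
type Client struct {
	peers map[string]Handler
}

// NewClientWithPeers builds a client from a node-ID -> handler map.
func NewClientWithPeers(peers map[string]Handler) *Client {
	return &Client{peers: peers}
}

// Send delivers a request to a single registered peer, erroring if the
// target was never registered in the peer map.
func (c *Client) Send(nodeID string, request []byte) ([]byte, error) {
	h, ok := c.peers[nodeID]
	if !ok {
		return nil, fmt.Errorf("%s is not connected", nodeID)
	}
	return h(request)
}

func main() {
	c := NewClientWithPeers(map[string]Handler{
		"node-1": func(req []byte) ([]byte, error) {
			return append([]byte("ack:"), req...), nil
		},
	})

	resp, err := c.Send("node-1", []byte("ping"))
	fmt.Println(string(resp), err)

	_, err = c.Send("node-2", []byte("ping"))
	fmt.Println(err)
}
```

Because tests construct the peer map explicitly, a request to an unknown node surfaces immediately as an error rather than vanishing silently.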
```go
for nodeID := range nodeIDs {
	network, ok := peerNetworks[nodeID]
	if !ok {
		return fmt.Errorf("%s is not connected", nodeID)
```
This is the reason for the big testing diff: the test utility now enforces that you're sending requests to a node registered in the peer map. As an alternative, we could drop the requests instead of erroring.
Shouldn't this drop the requests since normally an error from the sender would be treated as a fatal error?
We could drop it, which would more closely match how the network sender implementation behaves. One downside: since this utility is only used for testing, erroring makes tests easier to debug.
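The two behaviors being weighed can be sketched side by side behind an illustrative `dropUnknown` flag (hypothetical names, not the PR's API):

```go
package main

import "fmt"

// sendToPeers sketches the tradeoff discussed above: erroring on an
// unregistered peer (easier to debug in tests) versus silently dropping the
// request (closer to how the production network sender behaves).
func sendToPeers(peers map[string]bool, nodeIDs []string, dropUnknown bool) error {
	for _, nodeID := range nodeIDs {
		if !peers[nodeID] {
			if dropUnknown {
				continue // production-like: requests to unreachable peers are skipped
			}
			return fmt.Errorf("%s is not connected", nodeID)
		}
		// ... deliver the request to the peer's in-memory network ...
	}
	return nil
}

func main() {
	peers := map[string]bool{"node-1": true}
	fmt.Println(sendToPeers(peers, []string{"node-1", "node-2"}, true))
	fmt.Println(sendToPeers(peers, []string{"node-1", "node-2"}, false))
}
```

With drop semantics a typo'd node ID silently produces zero responses; with error semantics the same mistake fails the test at the send site, which is the debuggability argument above.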
network/p2p/acp118/aggregator.go (outdated)
```go
}

failedStakeWeight := uint64(0)
minThreshold := (totalStakeWeight * quorumNum) / quorumDen
```
Should this match https://github.com/ava-labs/avalanchego/blob/master/vms/platformvm/warp/signature.go#L150 exactly?
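For reference, one reason to match the linked check exactly is that it avoids dividing at all: comparing by cross-multiplication sidesteps the truncation in `(totalStakeWeight * quorumNum) / quorumDen`. This is a sketch of that idea using `math/big` to also avoid uint64 overflow, not a copy of signature.go:

```go
package main

import (
	"fmt"
	"math/big"
)

// meetsQuorum reports whether sigWeight/totalWeight >= quorumNum/quorumDen,
// comparing sigWeight*quorumDen against totalWeight*quorumNum in big.Int so
// neither truncation nor uint64 overflow can skew the result.
func meetsQuorum(sigWeight, totalWeight, quorumNum, quorumDen uint64) bool {
	lhs := new(big.Int).Mul(new(big.Int).SetUint64(sigWeight), new(big.Int).SetUint64(quorumDen))
	rhs := new(big.Int).Mul(new(big.Int).SetUint64(totalWeight), new(big.Int).SetUint64(quorumNum))
	return lhs.Cmp(rhs) >= 0
}

func main() {
	// With truncating division, (10*2)/3 = 6, so a signed weight of 6 would
	// pass a minThreshold check even though 6/10 < 2/3.
	fmt.Println(meetsQuorum(6, 10, 2, 3)) // false: 6*3 < 10*2
	fmt.Println(meetsQuorum(7, 10, 2, 3)) // true: 7*3 >= 10*2
}
```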
```go
{
	name: "aggregates from all validators 1/1",
	peers: map[ids.NodeID]p2p.Handler{
		nodeID0: NewHandler(&testVerifier{}, signer0),
	},
	ctx: context.Background(),
	validators: []Validator{
		{
			NodeID:    nodeID0,
			PublicKey: pk0,
			Weight:    1,
		},
	},
	wantSigners: []int{0},
	quorumNum:   1,
	quorumDen:   1,
},
```
Super easy to read these test cases ❤️
```go
	quorumDen: 1,
},
{
	name: "aggregates from some validators - 1/3",
```
Would it make sense to make these names a little more descriptive of the edge case they're testing? For example:

```diff
- name: "aggregates from some validators - 1/3",
+ name: "aggregates from min threshold - 1/3",
```
Reading through the rest, this is probably fine as is, since there's already a naming convention for success/failure cases; the success cases could just be more explicit.
```go
	quorumDen: 3,
},
{
	name: "aggregates from some validators - 2/3",
```
What's the intended difference for this test case? Just > 1 success rather than > 1 failure but still meeting minimum threshold?
It seems each success test case meets the exact required threshold. This is very well tested as is, but you could also add cases that exceed the minimum threshold.
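A sketch of what such an added case might look like, using simplified stand-in types rather than the test's real ones (only the naming convention is taken from the quoted snippet):

```go
package main

import "fmt"

// testCase is a stripped-down version of the table-driven pattern above,
// keeping just enough fields to express a signers-exceed-threshold case.
type testCase struct {
	name        string
	weights     []uint64 // weight per validator, by index
	wantSigners []int    // indices expected in the aggregate signature
	quorumNum   uint64
	quorumDen   uint64
}

// signedWeight sums the weights of the expected signers.
func signedWeight(tc testCase) uint64 {
	var w uint64
	for _, i := range tc.wantSigners {
		w += tc.weights[i]
	}
	return w
}

func main() {
	cases := []testCase{
		{
			name:        "aggregates above min threshold - 3/3 signers for 2/3 quorum",
			weights:     []uint64{1, 1, 1},
			wantSigners: []int{0, 1, 2},
			quorumNum:   2,
			quorumDen:   3,
		},
	}
	for _, tc := range cases {
		var total uint64
		for _, w := range tc.weights {
			total += w
		}
		ok := signedWeight(tc)*tc.quorumDen >= total*tc.quorumNum
		fmt.Printf("%s: quorum met = %v\n", tc.name, ok)
	}
}
```

The point of such a case is to pin down that aggregation keeps collecting (or at least tolerates) signatures beyond the minimum rather than stopping exactly at it.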
network/p2p/acp118/aggregator.go (outdated)
```go
// Fast-fail if it's not possible to generate a signature that meets the
// minimum threshold
failedStakeWeight += result.Validator.Weight
if totalStakeWeight-failedStakeWeight < minThreshold {
	return nil, 0, 0, ErrFailedAggregation
}
continue
```
Does this work with hypersdk's expected usage here? I thought num/den was going to be the maximum weight it would wait for, with the minimum lower than that. If we pass in the max here, we could terminate once we realize we can't reach the maximum, even though we could still have reached the weight hypersdk actually wanted.
Maybe it makes more sense to change the behavior so that this API blocks until all responses come back or the provided num/den threshold is reached, instead of failing early.
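A sketch of that suggested alternative, with hypothetical names: keep collecting responses until either the quorum threshold is reached or every validator has answered, treating individual failures as skips rather than aborting the aggregation:

```go
package main

import "fmt"

// result carries one validator's response: its stake weight and an error if
// it failed to produce a valid signature.
type result struct {
	weight uint64
	err    error
}

// collectSignatures drains responses until the signed weight reaches the
// quorumNum/quorumDen threshold or all numValidators have responded. It
// returns the accumulated signed weight and whether quorum was reached.
func collectSignatures(results <-chan result, numValidators int, totalWeight, quorumNum, quorumDen uint64) (uint64, bool) {
	var signedWeight uint64
	for i := 0; i < numValidators; i++ {
		r := <-results
		if r.err != nil {
			continue // a failed signer no longer aborts aggregation
		}
		signedWeight += r.weight
		if signedWeight*quorumDen >= totalWeight*quorumNum {
			return signedWeight, true // threshold reached early
		}
	}
	return signedWeight, false // all responses in, threshold not met
}

func main() {
	results := make(chan result, 3)
	results <- result{weight: 1, err: fmt.Errorf("timed out")}
	results <- result{weight: 1}
	results <- result{weight: 1}
	w, ok := collectSignatures(results, 3, 3, 2, 3)
	fmt.Println(w, ok)
}
```

Under this shape a caller passing its maximum desired threshold can never be failed early just because the maximum became unreachable; it simply gets back whatever weight did sign.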
network/p2p/acp118/aggregator.go (outdated)
```go
if !bls.Verify(validator.PublicKey, signature, r.message.UnsignedMessage.Bytes()) {
	r.results <- result{Validator: validator, Err: errFailedVerification}
	return
}
```
nice
network/p2p/acp118/aggregator.go (outdated)
```go
// AggregateSignatures blocks until quorumNum/quorumDen signatures from
// validators are requested to be aggregated into a warp message or the context
// is canceled. Returns the signed message and the amount of stake that signed
// the message. Caller is responsible for providing a well-formed canonical
// validator set corresponding to the signer bitset in the message.
func (s *SignatureAggregator) AggregateSignatures(
	ctx context.Context,
	message *warp.Message,
	justification []byte,
	validators []Validator,
	quorumNum uint64,
	quorumDen uint64,
) (*warp.Message, uint64, uint64, error) {
```
I don't think this correctly handles the case where BLS public keys are shared across validators.
In Warp, only one signature is ever allowed per BLS key in a warp message. If different nodeIDs have the same BLS key, their weights are aggregated under that BLS key's index.
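One way to handle that, sketched with a string stand-in for `*bls.PublicKey` (illustrative code, not the PR's fix): canonicalize the validator set by aggregating weights per unique key before building the signer bitset.

```go
package main

import "fmt"

// Validator pairs a node with its BLS public key. The key is a string here
// purely for illustration; the real type is *bls.PublicKey.
type Validator struct {
	NodeID    string
	PublicKey string
	Weight    uint64
}

// dedupeByKey collapses validators that share a BLS public key into a single
// entry whose weight is the sum of the members' weights, matching the warp
// rule that each key appears at most once in a message's signer bitset.
func dedupeByKey(validators []Validator) []Validator {
	index := map[string]int{} // public key -> position in out
	var out []Validator
	for _, v := range validators {
		if i, ok := index[v.PublicKey]; ok {
			out[i].Weight += v.Weight
			continue
		}
		index[v.PublicKey] = len(out)
		out = append(out, v)
	}
	return out
}

func main() {
	vdrs := dedupeByKey([]Validator{
		{NodeID: "node-1", PublicKey: "pk-A", Weight: 10},
		{NodeID: "node-2", PublicKey: "pk-A", Weight: 5},
		{NodeID: "node-3", PublicKey: "pk-B", Weight: 7},
	})
	for _, v := range vdrs {
		fmt.Println(v.PublicKey, v.Weight)
	}
}
```

After deduplication, a signature from the shared key counts its combined weight exactly once, rather than being double-counted or attributed to only one of the nodes.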
Co-authored-by: Stephen Buttolph <[email protected]> Signed-off-by: Joshua Kim <[email protected]>
Why this should be merged
Implements p2p client + server logic for signature request handling as described in acp-118 (ref).

How this works
Client:
How this was tested