propose POA testnet #158
Co-Authored-By: Jason Chai <[email protected]>
1. For block height `n`, a validator checks `n % VALIDATOR_COUNT`.
2. If `INDEX == n % VALIDATOR_COUNT`, it is the validator's turn to attest block `n`. The validator should wait for `BLOCK_INTERVAL` seconds, then attest block `n` with difficulty set to `2`.
3. If `INDEX != n % VALIDATOR_COUNT`, it is not the validator's turn to attest block `n`. The validator should wait for `BLOCK_INTERVAL + rand(VALIDATOR_COUNT) * 0.5` seconds to give another attester time to produce a new block. If no new block is produced during that time, the validator should attest a new block with difficulty set to `1`.
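The three steps above can be sketched as follows. This is a minimal illustration, not the RFC's implementation: the function name `attest_plan` and the reading of `rand(VALIDATOR_COUNT)` as a uniform integer in `[0, VALIDATOR_COUNT)` are assumptions, and `BLOCK_INTERVAL` is passed in because its value is not stated here.

```python
import random

IN_TURN_DIFFICULTY = 2      # from the proposal: in-turn blocks carry difficulty 2
OUT_OF_TURN_DIFFICULTY = 1  # from the proposal: out-of-turn blocks carry difficulty 1


def attest_plan(height, index, validator_count, block_interval, rng=random):
    """Return (wait_seconds, difficulty) for one validator at one block height.

    Assumption: rand(VALIDATOR_COUNT) is a uniform integer in [0, validator_count).
    """
    if index == height % validator_count:
        # In turn: wait exactly one block interval, then attest.
        return float(block_interval), IN_TURN_DIFFICULTY
    # Out of turn: wait longer, with jitter, and attest only if no block arrived.
    jitter = rng.randrange(validator_count) * 0.5
    return block_interval + jitter, OUT_OF_TURN_DIFFICULTY
```

The caller is still responsible for cancelling the out-of-turn attestation if a new block arrives before the wait elapses.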
Could the result of `BLOCK_INTERVAL + rand(VALIDATOR_COUNT) * 0.5` be `BLOCK_INTERVAL`, or very close to `BLOCK_INTERVAL`? If so, a validator and an attester may generate a new block at nearly the same time, and the validator's block may arrive at other nodes earlier than the attester's. This could cause a lot of 1-block fork switches on the testnet.
I suggest adding a buffer to the validator's wait period, e.g. `BLOCK_INTERVAL * 2 + rand(VALIDATOR_COUNT) * 0.5`.
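To make the concern concrete, the sketch below computes the smallest and largest head start an in-turn validator has over an out-of-turn one for a given base wait. It assumes `rand(VALIDATOR_COUNT)` returns an integer in `[0, VALIDATOR_COUNT)`, which the proposal does not specify; `head_start_range`, the 8-second interval, and the validator count of 4 are illustrative.

```python
def head_start_range(base_wait, block_interval, validator_count):
    """Return (min, max) seconds by which the in-turn wait precedes an out-of-turn wait.

    Assumption: rand(validator_count) is an integer in [0, validator_count).
    """
    min_wait = base_wait + 0 * 0.5
    max_wait = base_wait + (validator_count - 1) * 0.5
    return min_wait - block_interval, max_wait - block_interval


BLOCK_INTERVAL = 8   # hypothetical value, in seconds
VALIDATOR_COUNT = 4  # hypothetical value

# Original rule: the out-of-turn base wait equals BLOCK_INTERVAL.
original = head_start_range(BLOCK_INTERVAL, BLOCK_INTERVAL, VALIDATOR_COUNT)
# Suggested rule: the base wait is doubled.
buffered = head_start_range(BLOCK_INTERVAL * 2, BLOCK_INTERVAL, VALIDATOR_COUNT)
```

Under that assumption the original rule allows a head start of zero seconds, which is the near-simultaneous case raised in this comment, while the buffered rule guarantees at least one full `BLOCK_INTERVAL`.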
We could add another variable, `BLOCK_TIMEOUT`, set to a value slightly greater than `BLOCK_INTERVAL`, like 10s.
Every validator that is not in its turn waits for `BLOCK_TIMEOUT + rand(VALIDATOR_COUNT) * 0.5` seconds before producing a block; this reduces the chance that two validators attest at the same time.
In fact, even if two blocks are produced at the same time, it won't hurt: the block attested by an in-turn validator always has a higher total difficulty; even if two blocks have the same total difficulty, the next in-turn validator can decide the main chain. The chain selection works just like POW.
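The chain selection described here can be sketched as picking the fork with the highest total difficulty. The `Block` type and the helper names below are hypothetical; only the difficulty values 2 (in-turn) and 1 (out-of-turn) come from the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    height: int
    attester: int    # index of the validator that attested the block
    difficulty: int  # 2 if attested in turn, 1 otherwise


def total_difficulty(chain):
    return sum(block.difficulty for block in chain)


def select_main_chain(forks):
    """Pick the fork with the highest total difficulty, as in POW chain selection.

    On a tie the first such fork is returned; in the protocol a tie is
    resolved once a later block extends one of the forks.
    """
    return max(forks, key=total_difficulty)


# Two forks that differ only at height 5 with 4 validators (5 % 4 == 1).
in_turn = [Block(4, 0, 2), Block(5, 1, 2)]
out_of_turn = [Block(4, 0, 2), Block(5, 3, 1)]
assert select_main_chain([out_of_turn, in_turn]) is in_turn
```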
I am curious: when validator "n % VALIDATOR_COUNT" does not produce a block, why can't we ask the next validator to produce two blocks in a row? That would be much simpler, and we would have no risk of short forks.
I'm not sure this mechanism can ensure convergence with high probability. Forks happen easily, so a validator that attests to different forks within ATTEST_INTERVAL cannot be considered malicious, or the consensus may halt. Malicious validators can then exploit this to break convergence.
I am also curious about the case where the validator assigned to a round fails to propose a block. It is hard to grasp the intuition behind the strategy: for example, what role the randomness plays in "rand(VALIDATOR_COUNT) * 0.5", and why the difficulty is set to 2 when an in-turn attester produces a block and to 1 when an out-of-turn attester does. Is it possible for out-of-turn validators to collude and propose a chain of consecutive blocks of weight 1? I know this is what ATTEST_INTERVAL and the eviction rule are for. However, with the eviction rule it is hard to formally (by a program) tell adversaries from nodes suffering delays. I am worried that an attack by consecutive 1-weight blocks is still possible despite ATTEST_INTERVAL, since adversary nodes may "pipeline" and ATTEST_INTERVAL cannot be too long, because VALIDATOR_COUNT > ATTEST_INTERVAL must hold.
It seems some questions are asked due to inconsistent views on the security model of testnet PoA:
The intention of this protocol is not to force convergence to a single chain. Instead of eliminating forks, we eliminate the validators who create them. The core of this protocol is to make sure that honest validators can eventually perform an eviction to remove malicious validators from the validator list.
For 1, I am totally convinced. For 2-3, what I meant is I am concerned that "1" weight for not-in-turn blocks might be too great (it might look more natural if it is something parameterized by ATTEST_INTERVAL / VALIDATOR_COUNT), here is an example (which tells what I meant by "pipeline") that the adversary could forever propose 1-weight blocks with half of all nodes being adversaries.
Here we assume VALIDATOR_COUNT=6, ATTEST_INTERVAL=3 and 3 of them are malicious (sure the assumption is "<1/2 VALIDATOR_COUNT" rather than "<=1/2" so it might be somehow different). "1" (in the table) for Round_i and adv_j means the j-th adversary proposes a 1-weight block in round i. In this case, malicious nodes can cooperate to propose a chain of continuous 1-weight blocks of any large total weight they want. But I am not sure whether this will be a great problem to worry about, since this issue can be possibly solved by the eviction. |
|
https://github.com/nervosnetwork/rfcs/pull/158/files#diff-6eca90ec8afbdba69e3e0b53a6dcae4dR33 After produces block 1, Adv1 must wait for So Adv2 and Adv3 can't produce block |
Oh sorry, I did not make it clear. I just now (for simplicity) assumed a second block can be proposed when "current_index - previous_index >= ATTEST_INTERVAL" (but in fact, it should be ">"), so the adversary cannot propose block 4 in the real protocol. This is what I tried to mean by "sure the assumption is "<1/2 VALIDATOR_COUNT" rather than "<=1/2" so it might be somehow different". Sorry for not conveying it clearly. Anyway, I think this is not a great issue in practice.
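The "pipeline" example and the `>=` versus `>` distinction can be checked with a small simulation. This is a sketch under stated assumptions: adversaries take turns in a fixed rotation, a validator's eligibility depends only on the gap between the heights of its own consecutive blocks, and `can_pipeline` is a hypothetical helper, not part of the RFC.

```python
def can_pipeline(adversary_count, attest_interval, strict, rounds=100):
    """Check whether adversaries rotating in a fixed order can attest every block.

    strict=False uses the simplified rule `gap >= attest_interval`;
    strict=True uses the rule `gap > attest_interval`.
    """
    last_height = {}  # adversary index -> height of its previous block
    for height in range(rounds):
        adversary = height % adversary_count
        previous = last_height.get(adversary)
        if previous is not None:
            gap = height - previous
            allowed = gap > attest_interval if strict else gap >= attest_interval
            if not allowed:
                return False  # this adversary must still wait; the pipeline stalls
        last_height[adversary] = height
    return True


# VALIDATOR_COUNT=6, ATTEST_INTERVAL=3, 3 adversaries, as in the example above.
print(can_pipeline(3, 3, strict=False))  # True: the simplified rule permits it
print(can_pipeline(3, 3, strict=True))   # False: the stricter rule blocks it
```

With the strict rule, three adversaries stall at their fourth block, which matches the observation that the adversary cannot propose block 4 in the real protocol.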
Sure! This issue can be solved by the eviction rule if it is detected in time.
If we allow a validator to produce n blocks, how do we distinguish whether n blocks were really skipped or the validator is malicious? For example, in a 4-validator case, A and B are malicious, and C and D are honest. After B produces a block, A can pretend that C and D did not produce blocks within the timeout. How do C and D (or other nodes) handle this situation?
C and D will keep producing blocks at their designated time slot, and their blocks will have heavier weights, so they will consider the chain constructed by themselves the authentic chain. Am I missing something? I am not familiar with POA so very likely I miss sth. |
I see. As long as we
we can always solve the problem off-line when an alarm is raised via manual examination. I have no more questions. |
This protocol is extremely well suited to testing smart contracts on the testnet. In practice, it is very unlikely that a large portion of validators in the real testnet are adversaries, since they have no incentive to undermine a system meant for development. This protocol can rule out all the attacks I came up with, even with almost half of the validator nodes malicious, as long as an alarm is raised in time.
`ATTEST_INTERVAL` can be set to `VALIDATOR_COUNT / 2`; the honest validators can then eventually evict malicious validators unless half of the validators are corrupted.
One thing to note is that CKB uses a 2-phase commitment: a transaction must be proposed before it can be committed in a block. This means the honest validators need to produce at least two blocks to finally commit the eviction transaction, and these two blocks must be within the proposal window, so we choose a large enough value in the POA testnet: `TX_PROPOSAL_WINDOW` is set to `ProposalWindow(2, MAX_VALIDATOR_COUNT)`.
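The two-block requirement can be illustrated with CKB's proposal window rule, under which a transaction proposed at height `p` can be committed at height `c` only if `closest <= c - p <= farthest`. The sketch below assumes that rule and uses a hypothetical `MAX_VALIDATOR_COUNT` of 16; `can_commit` is an illustrative name, not a CKB API.

```python
MAX_VALIDATOR_COUNT = 16  # hypothetical value; the proposal does not fix it here


def can_commit(proposed_at, commit_at, closest=2, farthest=MAX_VALIDATOR_COUNT):
    """Return True if a transaction proposed at `proposed_at` may be committed at `commit_at`."""
    return closest <= commit_at - proposed_at <= farthest


# An eviction transaction proposed by an honest validator at height 100.
assert not can_commit(100, 101)  # too early: the window opens two blocks later
assert can_commit(100, 102)
assert can_commit(100, 100 + MAX_VALIDATOR_COUNT)
assert not can_commit(100, 101 + MAX_VALIDATOR_COUNT)  # the window has closed
```

Setting the far end of the window to `MAX_VALIDATOR_COUNT` leaves room for the honest validators' turns to come around again before the proposal expires.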
Nice catch
Closed. See #194 |
Updates:
The Aggron testnet uses POW. Unfortunately, the average block time can be very slow, up to a few minutes, as mining power joins and leaves; see the peaks in the chart at https://explorer.nervos.org/aggron/charts.
To provide a stable testnet for development, I propose a POA testnet to replace the current Aggron.