You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We would like to record "observed wiggles" duration metric in a Prometheus dashboard when the Astrid Block Consumer receives blocks from p2p at the chain tip. Probably using a summary metric like in UpdateBlockConsumerHeaderDownloadDelay works.
Calculating the wiggle is based on the signer (synonym for block producer/proposer) succession number.
For this, might be best to add a new HeaderWiggleCalculator (analogous to the existing HeaderDifficultyCalculator) with a function Wiggle(header) (uint64, error). Make sure to use the same signer's signaturesCache for efficiency. This may mean doing a small refactor to pass the same signaturesCache to the CanonicalChainBuilderFactory and to the HeaderWiggleCalculator.
A good place (option 1) to record this observation might be inside the Sync.applyNewBlockOnTip function. After connecting a new block using the canonical chain builder we get back a slice of newConnectedHeaders. We can loop over these to calculate the wiggle using the wiggle calculator and update the metric. The newConnectedHeaders slice should in most cases be quite short (bounded by the average milestone length in the worst case - distance from root of canonical chain build to the tip which is usually ~30 blocks on average on mainnet) so there should not be much of a performance impact of doing this on the hot path of chain tip processing. If concerned about this performance penalty then this metric update can be done async by spawning another goroutine or using a channel to some consumer that async will update the metric given the blocks. If taking the async approach make sure that there is a guarantee that all blocks will have their wiggles recorded in the same sequence as the headers were seen.
This model of ordered async processing of blocks already exists in the form of the polygon/sync package executionClientStore which is responsible for storing the blocks. Maybe this metric can be updated as part of its insertBlocks function (option 2). Another benefit of this is that the metric will then also be recorded if we hit the handleMilestoneTipMismatch flow and DownloadBlocksUsingMilestone. Also this metric will then be recorded for the blocks part of the syncToTip initial sync mode. Inside insertBlocks we can spawn a new goroutine for looping over all of the blocks and updating the wiggles metric using the wiggle calculator. To ensure ordering is correct id wait on that goroutine to finish at the very end just before exiting insertBlocks function. With this approach the metric will be updated in parallel while we are inserting blocks via the execution engine meaning there won't be any performance impact.
Another place (option 3) for this may be inside canonicalChainBuilder.Connect after all validation checks are done and just before adding a new forkTreeNode. The downside of this approach is that it introduces unnecessary metric logic to the canonical chain builder which we want to keep as slim and simple as possible and as business logic agnostic as possible.
Considering all options above, I'd recommend option 2.
The text was updated successfully, but these errors were encountered:
We would like to record "observed wiggles" duration metric in a Prometheus dashboard when the Astrid Block Consumer receives blocks from p2p at the chain tip. Probably using a summary metric like in
UpdateBlockConsumerHeaderDownloadDelay
works.Docs explaining what backup block producer wiggle times are in Bor - https://docs.polygon.technology/pos/architecture/bor/introduction/#wiggle-time-delay-before-backup-production
Example of how to calculate wiggle (this one is done for logging purposes in a different part of the flow but it is what we want to have a metric for) - https://github.com/erigontech/erigon/blob/main/polygon/bor/bor.go#L1203
Calculating the wiggle is based on the signer (synonym for block producer/proposer) succession number.
For this, might be best to add a new
HeaderWiggleCalculator
(analogous to the existingHeaderDifficultyCalculator
) with a functionWiggle(header) (uint64, error)
. Make sure to use the same signer'ssignaturesCache
for efficiency. This may mean doing a small refactor to pass the samesignaturesCache
to theCanonicalChainBuilderFactory
and to theHeaderWiggleCalculator
.A good place (option 1) to record this observation might be inside the
Sync.applyNewBlockOnTip
function. After connecting a new block using the canonical chain builder we get back a slice ofnewConnectedHeaders
. We can loop over these to calculate the wiggle using the wiggle calculator and update the metric. The newConnectedHeaders slice should in most cases be quite short (bounded by the average milestone length in the worst case - distance from root of canonical chain build to the tip which is usually ~30 blocks on average on mainnet) so there should not be much of a performance impact of doing this on the hot path of chain tip processing. If concerned about this performance penalty then this metric update can be done async by spawning another goroutine or using a channel to some consumer that async will update the metric given the blocks. If taking the async approach make sure that there is a guarantee that all blocks will have their wiggles recorded in the same sequence as the headers were seen.This model of ordered async processing of blocks already exists in the form of the polygon/sync package
executionClientStore
which is responsible for storing the blocks. Maybe this metric can be updated as part of itsinsertBlocks
function (option 2). Another benefit of this is that the metric will then also be recorded if we hit thehandleMilestoneTipMismatch
flow andDownloadBlocksUsingMilestone
. Also this metric will then be recorded for the blocks part of thesyncToTip
initial sync mode. InsideinsertBlocks
we can spawn a new goroutine for looping over all of the blocks and updating the wiggles metric using the wiggle calculator. To ensure ordering is correct id wait on that goroutine to finish at the very end just before exitinginsertBlocks
function. With this approach the metric will be updated in parallel while we are inserting blocks via the execution engine meaning there won't be any performance impact.Another place (option 3) for this may be inside
canonicalChainBuilder.Connect
after all validation checks are done and just before adding a newforkTreeNode
. The downside of this approach is that it introduces unnecessary metric logic to the canonical chain builder which we want to keep as slim and simple as possible and as business logic agnostic as possible.Considering all options above, I'd recommend option 2.
The text was updated successfully, but these errors were encountered: