Execution service 2: separate execution from validation #1536
Conversation
create mock recorder, store execution run requests
Made this a draft to avoid accidental merging.
Codecov Report

@@            Coverage Diff            @@
##           master    #1536     +/-  ##
=========================================
- Coverage   54.49%   50.76%   -3.73%
=========================================
  Files         221      270      +49
  Lines       33099    31704    -1395
  Branches        0      555     +555
=========================================
- Hits        18036    16095    -1941
- Misses      12906    13382     +476
- Partials     2157     2227      +70
Validation count client
Also fixes a bug that caused identical first and last calls to be returned
staker: don't prune batches required for reports
I have a couple of minor comments, and this has a merge conflict now, but it's looking good!
if v.recordSentA < countUint64 {
	v.recordSentA = countUint64
}
v.validatedA = countUint64
Don't these all need to be atomic writes?
No, because we're write-holding the reorg lock. There are comments, but I'll make them a little clearer.
We should add comments to created(), recordSent(), and validated() indicating that they require the reorg lock to be read-held. Right now Validated(t *testing.T) is called in a test without the lock, but since we don't seem to enable the race detector for that test, we should be fine.
if err != nil {
	if errors.Is(err, ErrGlobalStateNotInChain) && s.fatalErr != nil {
		fatal := fmt.Errorf("latest staked not in chain: %w", err)
		s.fatalErr <- fatal
I think this fatal error might prevent the node from performing a reorg it needs to do to correct its invalid state (because it shuts down before it's able to reorg). Perhaps this should only be fatal if the stakedInfo's inbox accumulator matches the inbox tracker's record, otherwise just return an error here.
Let's think about this.
We've already read the batch for this data and it's quite old (feed messages won't get us "caught up").
Most likely there was some weird one-off error in calculating state that we can't really recover from.
I think being very noisy with this error is worth having to jump through manual hoops in case it is recoverable.
Makes sense. I guess if someone is error-looping on this, they could always disable the staker for a bit until the reorg happens.
Message pruner fixes
recordingDb: add config, metrics, and size limits
Pruner fixes
LGTM
The block recorder becomes a function of the execution client.
Heavy rewrite of the block validators to support the new architecture, significantly simplifying them along the way.