-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop retry flow for error handling (log trigger) #12026
Stop retry flow for error handling (log trigger) #12026
Conversation
I see that you haven't updated any README files. Would it make sense to do so? |
I see that you haven't updated any CHANGELOG files. Would it make sense to do so? |
a3bd1f7
to
18b185f
Compare
3aff10d
to
cb19d4b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM: added some nitpicking comments.
core/services/ocr2/plugins/ocr2keeper/evmregistry/v21/encoding/interface.go
Outdated
Show resolved
Hide resolved
core/services/ocr2/plugins/ocr2keeper/evmregistry/v21/encoding/interface.go
Outdated
Show resolved
Hide resolved
core/services/ocr2/plugins/ocr2keeper/evmregistry/v21/mercury/streams/streams.go
Show resolved
Hide resolved
@@ -148,6 +151,7 @@ func (c *client) singleFeedRequest(ctx context.Context, ch chan<- mercury.Mercur | |||
c.lggr.Warnf("at block %s upkeep %s received status code %d for feed %s", sl.Time.String(), sl.UpkeepId.String(), httpResponse.StatusCode, sl.Feeds[index]) | |||
retryable = true | |||
state = encoding.MercuryFlakyFailure | |||
errCode = encoding.HttpToErrCode(httpResponse.StatusCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this errCode is only updated for this group of status? why not move this assignment outside of switch in L149?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want the error code just in specific scenarios, so just to be sure I've preferred to do it specifically when needed.
core/services/ocr2/plugins/ocr2keeper/evmregistry/v21/mercury/v03/request.go
Outdated
Show resolved
Hide resolved
@@ -154,26 +157,31 @@ func (c *client) multiFeedsRequest(ctx context.Context, ch chan<- mercury.Mercur | |||
case http.StatusUnauthorized: | |||
retryable = false | |||
state = encoding.UpkeepNotAuthorized | |||
errCode = encoding.HttpToErrCode(resp.StatusCode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems we can save some lines by moving this assignment outside of switch-case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want the error code just in specific scenarios, being explicit here seems more safe.
- typo - using encoding.ErrCodeNil instead of 0 - rename ErrCodeMercuryError -> ErrCodeDataStreamsError
core/services/ocr2/plugins/ocr2keeper/evmregistry/v21/mercury/mercury.go
Show resolved
Hide resolved
resultLen := len(streamsLookup.Feeds) | ||
ch := make(chan mercury.MercuryData, resultLen) | ||
if len(streamsLookup.Feeds) == 0 { | ||
return encoding.NoPipelineError, encoding.UpkeepFailureReasonInvalidRevertDataInput, [][]byte{}, false, 0 * time.Second, fmt.Errorf("invalid revert data input: feed param key %s, time param key %s, feeds %s", streamsLookup.FeedParamKey, streamsLookup.TimeParamKey, streamsLookup.Feeds) | ||
return encoding.NoPipelineError, encoding.UpkeepFailureReasonInvalidRevertDataInput, [][]byte{}, false, 0 * time.Second, encoding.ErrCodeNil, fmt.Errorf("invalid revert data input: feed param key %s, time param key %s, feeds %s", streamsLookup.FeedParamKey, streamsLookup.TimeParamKey, streamsLookup.Feeds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
technically this is also a user error and we can surface this in callback.
However, I'm not sure if we want to go through all 0.2 cases, properly define it and also test all cases. So one option is to just always return 0 error code here for 0.2 to satisfy the interface, while properly handle it for 0.3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for e.g. there are some errors like 0.2 permissioning which are done upstream and not in 0.2 module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can decide in followup PRs if we want exclude v0.2
@@ -237,20 +238,34 @@ func (s *streams) CheckCallback(ctx context.Context, values [][]byte, lookup *me | |||
} | |||
|
|||
func (s *streams) DoMercuryRequest(ctx context.Context, lookup *mercury.StreamsLookup, checkResults []ocr2keepers.CheckResult, i int) ([][]byte, error) { | |||
state, reason, values, retryable, retryInterval, err := encoding.NoPipelineError, encoding.UpkeepFailureReasonInvalidRevertDataInput, [][]byte{}, false, 0*time.Second, fmt.Errorf("invalid revert data input: feed param key %s, time param key %s, feeds %s", lookup.FeedParamKey, lookup.TimeParamKey, lookup.Feeds) | |||
state, reason, values, retryable, retryInterval, errCode, err := encoding.NoPipelineError, encoding.UpkeepFailureReasonInvalidRevertDataInput, [][]byte{}, false, 0*time.Second, encoding.ErrCodeNil, fmt.Errorf("invalid revert data input: feed param key %s, time param key %s, feeds %s", lookup.FeedParamKey, lookup.TimeParamKey, lookup.Feeds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these initial values used anywhere? I see we always overwrite them with either 0.2.DoRequest or 0.3.DoRequest, maybe simple to just declare them as vars?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could have declare some of the vars, but some has initial value and thats why they are defined like this
core/services/ocr2/plugins/ocr2keeper/evmregistry/v21/mercury/v03/request.go
Show resolved
Hide resolved
Quality Gate passedIssues Measures |
* Stop retry flow for error handling (log trigger) (#12026) * return timeout for retry interval * identify retryTimeout + placeholder for err handler * err codes * comments * added function and tests for handling err code * add checkErrorHandler (#12037) * Stop retry flow for error handling (conditional trigger) (#12032) * err codes for conditionals * unit tests for conditional * refactor streams error handler * fix mercury v0.2 request handling * fix mercury 0.3 DoRequest * fix tests * connect check error callback * add todo * add todo * improve comments * polish 0.2 * fix debug.go, refactor eth_call on checkCallback and checkErrorHandler, fix some minor comments (typo and test) * small fixes * calculate retry config for conditionals * rename to clarify function * cleanup pipeline execution errors * fix unit tests for v02_request.go * Fix bug in 0.3 request error code * add state assertion to single feed request tests * add test case for invalid response bytes * add tests for more retryable errors * add tests for retryable -> error conversion * polish 0.2 combining multiple feeds * add more tests for different combination of feed responses * remove unused fields * unit tests for v03 * fix mercury_test.go * minor polishing * thread control: added function that accept a context * use threadCtrl.GoCtx to ensure the timeout is applied with the provided context * added timeout for mercury requests (including retries) * fix lint * set timeout to 10s * use GoCtx within request clients * lint * add more 0.3 tests * set err code to timeout if ctx is done * Finalize stream error codes, polish requests to return consistent nil bytes upon error, use HttpToStreamsErrCode everywhere * add tests for checkErrorHanlder * treat timeout as non retryable error codes * allow empty feed request which returns error code * update test contract with a flag for checkErr result * generate wrappers for LogTriggeredStreamsLookup * handling empty performData case for err handler * test (wip) * fixing contract * waiting for err handler logs (wip) * update contract and generate wrappers * lint * use startBlock instead of 1 * add missing arg * check multiple responses: - server timeout - unauthorized - bad req - internal server err - not found * fix test * cleanup * set timeout in http client * Add timeout test for streams (#12170) * Add timeout test * fix DoRequest() to consider ctx (client v0.3) * make sure we timeout in time * align thread control test * hacky context timeout * trying to wait for timeout with child context * push hack * update * another fix * fix * use ctx background * add log * Use new ctx to implement timeout * fix test * add v0.2 test --------- Co-authored-by: amirylm <[email protected]> * Empty Commit --------- Co-authored-by: Amir Y <[email protected]> Co-authored-by: Lei <[email protected]>
* Stop retry flow for error handling (log trigger) (#12026) * return timeout for retry interval * identify retryTimeout + placeholder for err handler * err codes * comments * added function and tests for handling err code * add checkErrorHandler (#12037) * Stop retry flow for error handling (conditional trigger) (#12032) * err codes for conditionals * unit tests for conditional * refactor streams error handler * fix mercury v0.2 request handling * fix mercury 0.3 DoRequest * fix tests * connect check error callback * add todo * add todo * improve comments * polish 0.2 * fix debug.go, refactor eth_call on checkCallback and checkErrorHandler, fix some minor comments (typo and test) * small fixes * calculate retry config for conditionals * rename to clarify function * cleanup pipeline execution errors * fix unit tests for v02_request.go * Fix bug in 0.3 request error code * add state assertion to single feed request tests * add test case for invalid response bytes * add tests for more retryable errors * add tests for retryable -> error conversion * polish 0.2 combining multiple feeds * add more tests for different combination of feed responses * remove unused fields * unit tests for v03 * fix mercury_test.go * minor polishing * thread control: added function that accept a context * use threadCtrl.GoCtx to ensure the timeout is applied with the provided context * added timeout for mercury requests (including retries) * fix lint * set timeout to 10s * use GoCtx within request clients * lint * add more 0.3 tests * set err code to timeout if ctx is done * Finalize stream error codes, polish requests to return consistent nil bytes upon error, use HttpToStreamsErrCode everywhere * add tests for checkErrorHanlder * treat timeout as non retryable error codes * allow empty feed request which returns error code * update test contract with a flag for checkErr result * generate wrappers for LogTriggeredStreamsLookup * handling empty performData case for err handler * test (wip) * fixing contract * waiting for err handler logs (wip) * update contract and generate wrappers * lint * use startBlock instead of 1 * add missing arg * check multiple responses: - server timeout - unauthorized - bad req - internal server err - not found * fix test * cleanup * set timeout in http client * Add timeout test for streams (#12170) * Add timeout test * fix DoRequest() to consider ctx (client v0.3) * make sure we timeout in time * align thread control test * hacky context timeout * trying to wait for timeout with child context * push hack * update * another fix * fix * use ctx background * add log * Use new ctx to implement timeout * fix test * add v0.2 test --------- Co-authored-by: amirylm <[email protected]> * Empty Commit --------- Co-authored-by: Amir Y <[email protected]> Co-authored-by: Lei <[email protected]>
* Stream fallback: error handler (#12173) * Stop retry flow for error handling (log trigger) (#12026) * return timeout for retry interval * identify retryTimeout + placeholder for err handler * err codes * comments * added function and tests for handling err code * add checkErrorHandler (#12037) * Stop retry flow for error handling (conditional trigger) (#12032) * err codes for conditionals * unit tests for conditional * refactor streams error handler * fix mercury v0.2 request handling * fix mercury 0.3 DoRequest * fix tests * connect check error callback * add todo * add todo * improve comments * polish 0.2 * fix debug.go, refactor eth_call on checkCallback and checkErrorHandler, fix some minor comments (typo and test) * small fixes * calculate retry config for conditionals * rename to clarify function * cleanup pipeline execution errors * fix unit tests for v02_request.go * Fix bug in 0.3 request error code * add state assertion to single feed request tests * add test case for invalid response bytes * add tests for more retryable errors * add tests for retryable -> error conversion * polish 0.2 combining multiple feeds * add more tests for different combination of feed responses * remove unused fields * unit tests for v03 * fix mercury_test.go * minor polishing * thread control: added function that accept a context * use threadCtrl.GoCtx to ensure the timeout is applied with the provided context * added timeout for mercury requests (including retries) * fix lint * set timeout to 10s * use GoCtx within request clients * lint * add more 0.3 tests * set err code to timeout if ctx is done * Finalize stream error codes, polish requests to return consistent nil bytes upon error, use HttpToStreamsErrCode everywhere * add tests for checkErrorHanlder * treat timeout as non retryable error codes * allow empty feed request which returns error code * update test contract with a flag for checkErr result * generate wrappers for LogTriggeredStreamsLookup * handling empty performData case for err handler * test (wip) * fixing contract * waiting for err handler logs (wip) * update contract and generate wrappers * lint * use startBlock instead of 1 * add missing arg * check multiple responses: - server timeout - unauthorized - bad req - internal server err - not found * fix test * cleanup * set timeout in http client * Add timeout test for streams (#12170) * Add timeout test * fix DoRequest() to consider ctx (client v0.3) * make sure we timeout in time * align thread control test * hacky context timeout * trying to wait for timeout with child context * push hack * update * another fix * fix * use ctx background * add log * Use new ctx to implement timeout * fix test * add v0.2 test --------- Co-authored-by: amirylm <[email protected]> * Empty Commit --------- Co-authored-by: Amir Y <[email protected]> Co-authored-by: Lei <[email protected]> * Empty Commit --------- Co-authored-by: Amir Y <[email protected]> Co-authored-by: Lei <[email protected]>
AUTO-9005
Description
The idea is to stop retry ceremony after 6 rounds (first 5 retries 1 second, 1 retry 5 second) and then invoke error handler.
NOTES
timeout
once the retry round is 6, but in followup PRs we will make a call to err handler throughexecuteCallback()
Changes
DoRequest
checkErrHandler
Testing