Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nonce Error Handling #365

Closed
wants to merge 26 commits into from
Closed

Nonce Error Handling #365

wants to merge 26 commits into from

Conversation

augustbleeds
Copy link
Collaborator

@augustbleeds augustbleeds commented Mar 1, 2024

Summary:

  • Handle invalid nonce Error during broadcast and confirmation by clearing the txstore of all unconfirmed txs and resubmitting the txs with new nonces. During broadcast invalid nonce error can happen during fee estimation or tx invocation.
  • Have to use the centralized feeder gateway client to get the status of a rejected transaction (in order to determine if we should retry txs with new nonces) because that is the only way to do so today (asked Starkware)
  • You must now provide the FeederURL on all environments besides testnet and sepolia testnet
  • Added an exponential backoff timer to prevent handling nonce error too quickly

Sepolia Testnet: "https://alpha-sepolia.starknet.io/feeder_gateway"
Mainnet: "https://alpha-mainnet.starknet.io/feeder_gateway"
Integration: "https://external.integration.starknet.io/feeder_gateway"
Sepolia Integration: "https://integration-sepolia.starknet.io/feeder_gateway"
Goerli Testnet: "https://alpha4.starknet.io/feeder_gateway"
Goerli Integration: "https://external.integration.starknet.io"

Some less related changes

  • If you use a nethermind node, they set the x-apikey header you you have to add APIKey = '<>' in the Node TOML config
  • Removed chain id for nonce manager since there's only 1 nonce manager per chain

@augustbleeds augustbleeds changed the base branch from develop to fix_ci March 1, 2024 20:16
relayer/pkg/chainlink/txm/nonce.go Outdated Show resolved Hide resolved
@@ -64,6 +64,23 @@ func (nm *nonceManager) HealthReport() map[string]error {
return map[string]error{nm.Name(): nm.starter.Healthy()}
}

func (nm *nonceManager) Sync(ctx context.Context, address *felt.Felt, chainId string, client NonceManagerClient) error {
if err := nm.validate(address, chainId); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we could avoid locking/unlocking twice, eg by refactoring and adding a validateLocked function which requires the lock to be held as a precondition

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh nice, i didn't realize validate was also locking. i'll leave for now but good catch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feeling like maybe this isn't a nit - could we handle this so it doesn't lock twice?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, i think it is better to leave it as is because confirmer and broadcaster both access the nonce manager concurrently. Therefore, even if you check that the lock is held in a hypothetical "validateLocked", you don't know whether the lock was held by the caller of "validateLocked" or by another goroutine.

If it's by another goroutine, then we could be reading the lock when the other goroutine is changing it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i could just always call validate() within the bounds of the caller's lock. maybe that's what you were referring to all along

relayer/pkg/chainlink/txm/txstore.go Show resolved Hide resolved
@cl-sonarqube-production
Copy link

Quality Gate failed Quality Gate failed

Failed conditions
36.8% Coverage on New Code (required ≥ 75%)

See analysis details on SonarQube

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants