Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Greatly improve connection resets on DoIP and miscellaneous fixes throughout #610

Merged
merged 10 commits into from
Oct 23, 2024

Conversation

ferdinandjarisch
Copy link
Contributor

This PR is a collection of multiple fixes and small adaptions at different levels.

Most notably, it greatly improves the handling of ConnectionErrors, especially for DoIP, rendering our transport stack way more resilient against connection losses during, e.g., planned reconnects or unexpected ECU crashes.

@ferdinandjarisch ferdinandjarisch force-pushed the doip-hotfixes branch 5 times, most recently from 0320ce4 to 97d57d8 Compare October 23, 2024 06:48
Tasks that terminate with error will produce a stack trace that will be
displayed once the interpreter finished. This commit ensures that we
consume the stack trace and generate a log message when it happens
instead.
Currently we never call wait_for_ecu with a explicit timeout value but
instead default on self.timeout.
This is unexpected and can be confusing, so instead default to 10s,
which is 5x more than the current default of 2s coming from self.timeout
UDSExceptions do not require to reconnect the transport.
Currently max_retry already counts the initial request as well, which is
counter-intuitive.
This commit effectively turns retries into retries, meaning a value for
max_retry of 3 would result in a total of 4 requests sent to the target.
The TCP connection of DoIP can close midway through a request,
currently resulting in broken states. This commit gracefully handles
ConnectionErrors while waiting for a response within the retry loop,
such that recoverable ConnectionErrors are simply recovered.
This commit adds a loop to the reconnect function to be able to recover
also when the very first reconnection-attempt is unsuccessful.
The loop is only active, however, when a timeout value is set, which for
now will only be the case for DoIP.
@ferdinandjarisch ferdinandjarisch merged commit 17408d9 into master Oct 23, 2024
12 checks passed
@ferdinandjarisch ferdinandjarisch deleted the doip-hotfixes branch October 23, 2024 07:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants