-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More error tolerant web ui check #1040
More error tolerant web ui check #1040
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what this code is trying to do.
WaitUntilWorking
tries a URL every 50ms until it succeeds or 10 seconds has passed. It seems the code adds another set of loops to retry on top of retries (which appears strange?) ... no?
My guessing of William hitting that 404 error is that the gin server wasn't fully started before we hit the health check endpoint (since the error is intermittent/random). Since 404 instead of 200 was returned, |
Gotcha. I would rather add another option to WaitUntilWorking to make the wrong response code non-fatal. Layers of retries are a bit scary. |
Sounds good. I'll refactor the code |
@haoming29 is this ready for review or should I wait a bit longer? |
Sorry I haven't get the chance to work on the refactoring. I will let you know when its' review for another round of review |
@jhiemstrawisc Should be good for the review. The Ubuntu test failure should be a random one and re-running it will likely fix the issue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small typos, otherwise LGTM
Thanks for catching them! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
Should fix #1033
The suspicion is that there's a racing condition at the registry start somehow, and the web engine is not "ready-ready" to serve requests, although it could be other issues. However, this PR will make the web ui check more robust in case of the error code returned