Prevent UI lockup caused by PubSub race condition #9

Open
wants to merge 1 commit into
base: master

jonny12375

When running tests repeatedly, we occasionally see the error "HTTP request for phase descriptors failed." in the web UI when a test completes, which prevents us from starting the test again.

From some sleuthing in the web UI console, I noticed that the PubSub websocket publishes (what it believes to be) the UID of the currently running test. The web UI then requests tests/<test_uid>/phases, which should return the phase info for that test. However, when we see this error, the web UI is requesting the phases for the previous test. The Python side only stores the information for the currently running test, so the request for the previous test's info fails.

The issue (I believe) is that when a test is finishing and a new test is started, the event for the previous test finishing can arrive after the notification for the new test starting. This means we store the UID of the previous test in StationPubSub and incorrectly report that UID to the web UI, causing it to request the phase info for the old test.

My hacky workaround is to only allow the test UID to change when we've never seen the new UID before. If we see an update for an old test, we refuse to update the UID.
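
For illustration only, here is a minimal sketch of that "only accept never-seen UIDs" guard. It is not the code in this PR: `UidTracker`, `on_update`, and the `test-1`/`test-2` UIDs are hypothetical names made up for the example, and the real bookkeeping in StationPubSub may look quite different.

```python
class UidTracker:
    """Hypothetical stand-in for the test-UID bookkeeping in StationPubSub."""

    def __init__(self):
        self.current_uid = None
        self._seen_uids = set()

    def on_update(self, test_uid):
        """Accept test_uid only if it has never been seen before.

        Returns True if the current UID changed, False otherwise.
        """
        if test_uid in self._seen_uids:
            # Either a repeat update for the current test or a late event
            # from an older test; neither should change the current UID.
            return False
        self._seen_uids.add(test_uid)
        self.current_uid = test_uid
        return True


tracker = UidTracker()
# Out-of-order delivery: test-1 starts, test-2 starts, and then the
# "finished" event for test-1 arrives late.
for uid in ["test-1", "test-2", "test-1"]:
    tracker.on_update(uid)
print(tracker.current_uid)  # test-2
```

Without the guard, the late `test-1` event would overwrite the current UID. Note that the set of seen UIDs grows for as long as the process runs, and the guard only covers the ordering described above, so other orderings may still slip through.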

@jonny12375
Author

Damn, I've managed to still encounter the error 😞
Will keep investigating as this affects us a lot

@thealastair

> Damn, I've managed to still encounter the error 😞 Will keep investigating as this affects us a lot

Should we just change the front end to re-request the current test UID when this failure occurs? I can have a look at this after the festivities.
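
A rough sketch of that retry idea, written in Python purely to show the control flow (the web UI itself is not Python). Everything here is an assumption for illustration: `fetch_phases_with_refresh`, the two injected fetch callables, and the fake backend data are hypothetical, and a failed request is modelled as a `LookupError`.

```python
def fetch_phases_with_refresh(fetch_phases, fetch_current_uid, test_uid,
                              max_retries=1):
    """Request phases for test_uid; on failure, refresh the UID and retry.

    fetch_phases(uid) is assumed to raise LookupError for an unknown UID.
    Returns a (uid, phases) pair for the UID that finally succeeded.
    """
    for attempt in range(max_retries + 1):
        try:
            return test_uid, fetch_phases(test_uid)
        except LookupError:
            if attempt == max_retries:
                raise  # still failing after refreshing; surface the error
            # The UID we were told about may be stale; ask again.
            test_uid = fetch_current_uid()


# Fake backend that only knows about the currently running test.
phases_by_uid = {"test-2": ["setup", "measure"]}


def fake_fetch_phases(uid):
    return phases_by_uid[uid]  # KeyError is a subclass of LookupError


def fake_fetch_current_uid():
    return "test-2"


uid, phases = fetch_phases_with_refresh(
    fake_fetch_phases, fake_fetch_current_uid, "test-1")
print(uid, phases)  # test-2 ['setup', 'measure']
```

Capping the retries keeps a persistently wrong UID from turning into a request loop; in that case the original error would still reach the user.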

2 participants