
Flaky tests in hercules #176

Open
koxu1996 opened this issue Jul 30, 2024 · 3 comments

Comments

@koxu1996
Contributor

It seems that tests are failing in Hercules randomly.

@marijanp
Contributor

Could you provide a list of which ones are flaky?
Also note that our CI machine is not very powerful. Pushing to and updating multiple PRs will cause the system load to go above 300%, which can delay tests that spawn a casper network and wait only a limited time for results before they fail.

@koxu1996
Contributor Author

E2E tests are sometimes failing with Success key not found in JSON.

For example, job #2179 (commit 2eda67a) failed first, then succeeded when I triggered "re-run effects and failures":

image

@marijanp
Contributor

Yeah, if you scroll up in the logs you will see that the cctl service didn't start properly. cctl is a oneshot service, and this check

let timed_out = start_time.elapsed().as_secs() > 90;

will make it fail to start once more than 90 seconds have elapsed. This is due to the high demand for resources, especially when many builds are running at the same time. (We have a notification stream on Zulip that reports the current system load of the CI machine.)

The solution is to increase the timeout linked above, or to reduce the number of concurrent builds.
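
For illustration only, here is a minimal sketch of what a configurable startup timeout could look like. It is not the actual cctl code: the environment variable `CCTL_STARTUP_TIMEOUT_SECS` and the helpers `timeout_from`, `startup_timeout`, and `wait_until_ready` are hypothetical names, and the only detail taken from the snippet above is the 90-second default.

```rust
use std::env;
use std::thread;
use std::time::{Duration, Instant};

/// Default matching the 90-second limit in the check quoted above.
const DEFAULT_TIMEOUT_SECS: u64 = 90;

/// Hypothetical helper: parse an optional override in seconds and fall back
/// to the default when the value is missing or not a valid number.
fn timeout_from(raw: Option<&str>) -> Duration {
    let secs = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_TIMEOUT_SECS);
    Duration::from_secs(secs)
}

/// Hypothetical helper: read the override from an assumed environment
/// variable, so a heavily loaded CI machine can be given more time.
fn startup_timeout() -> Duration {
    timeout_from(env::var("CCTL_STARTUP_TIMEOUT_SECS").ok().as_deref())
}

/// Poll `is_ready` until it returns true or `timeout` has elapsed.
/// Returns Ok(elapsed) on success and Err(elapsed) on timeout.
fn wait_until_ready<F: FnMut() -> bool>(
    mut is_ready: F,
    timeout: Duration,
    poll_interval: Duration,
) -> Result<Duration, Duration> {
    let start_time = Instant::now();
    loop {
        if is_ready() {
            return Ok(start_time.elapsed());
        }
        if start_time.elapsed() > timeout {
            return Err(start_time.elapsed());
        }
        thread::sleep(poll_interval);
    }
}
```

With something along these lines, the CI environment could set a larger value (for example 300) without changing the default for local runs, assuming the service startup code calls `startup_timeout()` instead of hard-coding 90.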
