LTM shard monitoring timeouts on a per-test basis #10

Open
tytso opened this issue Jun 5, 2018 · 0 comments

tytso commented Jun 5, 2018

The shard monitoring in the LTM currently assumes a fixed timeout of 1 hour between status updates from the child test appliance. If the last status update occurred more than an hour ago, the monitor process assumes that the test appliance has crashed or wedged, and creates a serial port dump in place of test results.
The reason for the fixed 1 hour timeout is that generic/027 and a few other tests in xfstests are heavily IOPS bound and take a long time to run (some runs take longer than 3000 seconds).
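
For reference, the current behaviour boils down to a single staleness check. The snippet below is only a minimal sketch of that check, not the LTM's actual code; `shard_timed_out` and `FIXED_TIMEOUT_SECS` are hypothetical names chosen for illustration.

```python
import time

# The fixed one-hour timeout described above (name is hypothetical).
FIXED_TIMEOUT_SECS = 3600


def shard_timed_out(last_update, now=None, timeout=FIXED_TIMEOUT_SECS):
    """Return True if the last status update is older than the timeout.

    last_update and now are Unix timestamps in seconds.
    """
    if now is None:
        now = time.time()
    return (now - last_update) > timeout
```

With this check, a kernel crash right after a status update goes unnoticed for up to an hour, regardless of which test was running.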

If the test appliance were more diligent in reporting the latest test being run, custom timeouts could be set for each test in xfstests, and a kernel crash would be detected much sooner. For example, generic/001 is usually quite fast to run, so if the LTM is aware that the test appliance is running generic/001, the timeout could be somewhere in the range of 20-30 seconds rather than the fixed hour.
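
A rough sketch of what a per-test lookup could look like, assuming the test appliance reports the name of the test it is currently running. The table contents and the names `PER_TEST_TIMEOUT_SECS` and `timeout_for` are made up for illustration; only the generic/001 figure comes from the estimate above.

```python
# Fall back to the existing fixed timeout when nothing better is known.
DEFAULT_TIMEOUT_SECS = 3600

# Hypothetical per-test table; 30 s for generic/001 is the upper end of
# the 20-30 second range suggested above.
PER_TEST_TIMEOUT_SECS = {
    "generic/001": 30,
}


def timeout_for(test_name):
    """Return the timeout for test_name, or the default if it is unknown.

    test_name may be None if the appliance has not reported a test yet.
    """
    return PER_TEST_TIMEOUT_SECS.get(test_name, DEFAULT_TIMEOUT_SECS)
```

Falling back to the one-hour default for unlisted tests keeps the existing behaviour for anything that has not been measured yet.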

Tests could also be sorted into several size categories, e.g. "xsmall", "small", "medium", "large", and "xlarge", with a timeout assigned to each category.
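
The category idea might be sketched as two small tables: one mapping each size to a timeout, and one mapping each test to a size. All the numbers and the test-to-size assignments below are illustrative guesses, not measured values, and the names are hypothetical.

```python
# Illustrative timeouts per size category, in seconds (assumed values).
SIZE_TIMEOUT_SECS = {
    "xsmall": 30,
    "small": 120,
    "medium": 600,
    "large": 1800,
    "xlarge": 3600,
}

# Hypothetical assignment of tests to categories.
TEST_SIZE = {
    "generic/001": "xsmall",
    "generic/027": "xlarge",
}


def category_timeout(test_name):
    """Return the timeout for a test's size category.

    Unclassified tests are treated as "xlarge", the most conservative choice.
    """
    size = TEST_SIZE.get(test_name, "xlarge")
    return SIZE_TIMEOUT_SECS[size]
```

Compared with a per-test table, this only requires classifying each test once rather than tuning a number for every test.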

To be even more sophisticated, the timeouts could be modified based on the number of CPUs and the size of the scratch disk of that particular test appliance, and on whether the test is more CPU bound or IOPS bound.
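
One possible shape for such scaling is sketched below. It assumes, purely for illustration, that CPU-bound tests speed up in proportion to the CPU count and that IOPS-bound tests speed up in proportion to scratch disk size; neither relationship has been measured here. The reference machine (2 CPUs, 10 GB scratch disk), the clamping range, and the function name are all hypothetical.

```python
def scaled_timeout(base_secs, bound, cpus, scratch_gb,
                   ref_cpus=2, ref_scratch_gb=10):
    """Scale a base timeout by appliance resources.

    bound is "cpu" or "iops". base_secs is the timeout on the (assumed)
    reference appliance with ref_cpus CPUs and a ref_scratch_gb scratch disk.
    """
    if bound == "cpu":
        factor = ref_cpus / cpus
    elif bound == "iops":
        factor = ref_scratch_gb / scratch_gb
    else:
        raise ValueError("bound must be 'cpu' or 'iops'")
    # Clamp so that a very large or very small appliance cannot push the
    # timeout to an extreme value (the limits are arbitrary).
    factor = min(max(factor, 0.5), 4.0)
    return int(round(base_secs * factor))
```

For example, `scaled_timeout(600, "cpu", cpus=1, scratch_gb=10)` doubles the base timeout to 1200 seconds for a single-CPU appliance.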
