Data Diagnosis command is not idempotent when diagnosis rule file uses failure_check function #626
Comments
Note: upon further testing, this inconsistent behavior can also be triggered even when the diagnosis rule file contains no failure_check rules.
Hi, thanks for reaching us!
Hi @jorgeesg, do you still have this issue? If not, we will close it.
Hi Jorge, we have merged PR #638 to fix the issue, so this issue is going to be closed. Please let us know if you have more questions.
Thank you very much for the support and help :)
What's the issue, what's expected?:
Given a baseline file and a diagnosis rule file, the generated diagnosis_summary report varies between executions.
The inconsistent diagnosis behavior occurs when using the "failure_check" function in the diagnosis rule file.
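One way to confirm the inconsistency is to run the diagnosis several times against the same inputs, write each run to a separate output directory, and compare the resulting reports byte for byte. The following is a minimal sketch, not part of SuperBench; the helper names `file_digest` and `reports_match` are hypothetical, and it assumes the report contains no timestamps or other fields that legitimately differ between runs.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def reports_match(report_paths):
    """Return True if every report in report_paths has identical content.

    An empty list or a single report trivially matches.
    """
    digests = {file_digest(p) for p in report_paths}
    return len(digests) <= 1
```

Called with the paths of the diagnosis_summary files from two or more runs, `reports_match` returning False indicates that the runs produced different reports from the same baseline and rule files.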
How to reproduce it?:
Logs and snapshots:
When return code metrics are properly used by the data diagnosis process, you will always see two log messages, as in this picture: one from data_diagnosis.py line 265 and the other from line 330.
In contrast, this second image shows what the logs look like when SuperBench does not correctly apply the return code diagnosis rules. It simply marks all nodes as bad, using all the return code metrics.
Additional information:
SB version - 0.10
The current workaround is to NOT use the "failure_check" function and to use "value" in its place. However, users relying on "failure_check" may be unaware of this behavior.