extend self-test log processing #151
base: master
Signed-off-by: Aritas1 <[email protected]>
// assume the table will always be in descending order
processedTypes := make(map[string]bool)

for _, logEntry := range smart.json.Get("ata_smart_self_test_log.standard.table").Array() {
This should accept either `standard` or `extended`. Some args & device combinations only have one of them. The layout of the JSON struct is the same.
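A minimal sketch of that fallback, using only the standard library rather than the JSON helper seen in the diff. The struct types, the function name `selfTestTable`, and the `lifetime_hours` field are assumptions made for illustration; only the `ata_smart_self_test_log.standard.table` path and its `extended` sibling come from this review.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// selfTestEntry holds only the fields this sketch needs. The field names
// are assumptions based on the JSON quoted in this review.
type selfTestEntry struct {
	Type struct {
		Value  int    `json:"value"`
		String string `json:"string"`
	} `json:"type"`
	LifetimeHours int `json:"lifetime_hours"`
}

// selfTestLog is the shared layout of the "standard" and "extended" logs.
type selfTestLog struct {
	Table []selfTestEntry `json:"table"`
}

// selfTestTable returns the first non-empty table found, checking
// "standard" first and falling back to "extended".
func selfTestTable(raw []byte) ([]selfTestEntry, error) {
	var doc struct {
		Log struct {
			Standard *selfTestLog `json:"standard"`
			Extended *selfTestLog `json:"extended"`
		} `json:"ata_smart_self_test_log"`
	}
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	for _, log := range []*selfTestLog{doc.Log.Standard, doc.Log.Extended} {
		if log != nil && len(log.Table) > 0 {
			return log.Table, nil
		}
	}
	return nil, nil
}

func main() {
	// Hypothetical input that only carries the "extended" log.
	raw := []byte(`{"ata_smart_self_test_log":{"extended":{"table":[
		{"type":{"value":1,"string":"Short offline"},"lifetime_hours":100}]}}}`)
	table, err := selfTestTable(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(table), table[0].Type.String)
}
```

Checking `standard` first keeps the current behavior for devices that report both logs.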
logTestType = "unknown"
}

if !processedTypes[logTestType] {
this is implicitly trusting that the tests appear in newest to oldest order. I don't know if I trust drives enough for that.
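One possible way to drop the ordering assumption is to compare a timestamp per entry instead of taking the first row seen for each type. The sketch below assumes each row carries a power-on-hours value (called `LifetimeHours` here) and that a larger value means a more recent test; the `logEntry` type and `latestPerType` function are hypothetical and not part of this PR.

```go
package main

import "fmt"

// logEntry is a hypothetical, already-parsed self-test log row.
type logEntry struct {
	TestType      string
	LifetimeHours int
}

// latestPerType keeps, for each test type, the entry with the highest
// LifetimeHours, regardless of the order in which entries appear.
func latestPerType(entries []logEntry) map[string]logEntry {
	latest := make(map[string]logEntry)
	for _, e := range entries {
		cur, seen := latest[e.TestType]
		if !seen || e.LifetimeHours > cur.LifetimeHours {
			latest[e.TestType] = e
		}
	}
	return latest
}

func main() {
	// Deliberately not sorted newest-first.
	entries := []logEntry{{"short", 100}, {"long", 90}, {"short", 250}}
	for testType, e := range latestPerType(entries) {
		fmt.Println(testType, e.LifetimeHours)
	}
}
```

This replaces the `processedTypes` "first one wins" check with an explicit comparison, at the cost of visiting every row.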
testTime = testTime * 60 * 60

// skip running tests
if testRunningIndicator != 0 {
this is not correct, from one of my systems:
"status": {
"value": 41,
"string": "Interrupted (host reset)",
"remaining_percent": 90
}
`status.passed` is NOT present in this case.
I don't have any SATA drives w/ failing checks to compare presently, but I worry they are also non-zero.
Ok, it's definitely in need of work; also in the smartctl sources:
std::string msgstat;
switch (test_status >> 4) {
case 0x0: msgstat = "Completed without error"; break;
case 0x1: msgstat = "Aborted by host"; break;
case 0x2: msgstat = "Interrupted (host reset)"; break;
case 0x3: msgstat = "Fatal or unknown error"; break;
case 0x4: msgstat = "Completed: unknown failure"; break;
case 0x5: msgstat = "Completed: electrical failure"; break;
case 0x6: msgstat = "Completed: servo/seek failure"; break;
case 0x7: msgstat = "Completed: read failure"; break;
case 0x8: msgstat = "Completed: handling damage??"; break;
case 0xf: msgstat = "Self-test routine in progress"; break;
default: msgstat = strprintf("Unknown status (0x%x)", test_status >> 4);
}
So if it's 0xF then skip it as running; otherwise map the error.
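A rough Go rendering of that rule, mirroring the smartctl switch quoted above. The `selfTestStatus` type and `decodeSelfTestStatus` function are hypothetical names, and the sketch assumes `status.value` from the JSON is the raw status byte. The sample value 41 is 0x29, so its high nibble is 0x2, which matches the "Interrupted (host reset)" string in the example output.

```go
package main

import "fmt"

// selfTestStatusMessages mirrors the smartctl switch on test_status >> 4.
var selfTestStatusMessages = map[int]string{
	0x0: "Completed without error",
	0x1: "Aborted by host",
	0x2: "Interrupted (host reset)",
	0x3: "Fatal or unknown error",
	0x4: "Completed: unknown failure",
	0x5: "Completed: electrical failure",
	0x6: "Completed: servo/seek failure",
	0x7: "Completed: read failure",
	0x8: "Completed: handling damage??",
	0xf: "Self-test routine in progress",
}

// selfTestStatus is a hypothetical decoded form of status.value.
type selfTestStatus struct {
	Code    int    // high nibble of the raw status byte
	Message string // human-readable text for Code
	Running bool   // true only when Code is 0xf
}

// decodeSelfTestStatus extracts the high nibble and maps it to a message.
// The mask keeps the code in 0..15 even if value exceeds one byte.
func decodeSelfTestStatus(value int) selfTestStatus {
	code := (value >> 4) & 0xf
	msg, ok := selfTestStatusMessages[code]
	if !ok {
		msg = fmt.Sprintf("Unknown status (0x%x)", code)
	}
	return selfTestStatus{Code: code, Message: msg, Running: code == 0xf}
}

func main() {
	s := decodeSelfTestStatus(41) // value from the JSON sample above
	fmt.Printf("%d %q running=%t\n", s.Code, s.Message, s.Running)
}
```

With this shape, the exporter could skip entries where `Running` is true and export `Code` for everything else, instead of treating any non-zero value as "running".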
@@ -399,6 +401,50 @@ func (smart *SMARTctl) mineDeviceErrorLog() {
}
}

func (smart *SMARTctl) mineDeviceSelfTest() {
validTypes := map[int]string{
from smartctl sources:
switch (test_type) {
case 0x00: msgtest = "Offline"; break;
case 0x01: msgtest = "Short offline"; break;
case 0x02: msgtest = "Extended offline"; break;
case 0x03: msgtest = "Conveyance offline"; break;
case 0x04: msgtest = "Selective offline"; break;
case 0x7f: msgtest = "Abort offline test"; break;
case 0x81: msgtest = "Short captive"; break;
case 0x82: msgtest = "Extended captive"; break;
case 0x83: msgtest = "Conveyance captive"; break;
case 0x84: msgtest = "Selective captive"; break;
default:
if ((0x40 <= test_type && test_type <= 0x7e) || 0x90 <= test_type)
msgtest = strprintf("Vendor (0x%02x)", test_type);
else
msgtest = strprintf("Reserved (0x%02x)", test_type);
}
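For comparison, the same mapping could be written in Go roughly as follows. This is a sketch that mirrors the smartctl switch above rather than the `validTypes` map in this PR; the function name `selfTestTypeName` is hypothetical.

```go
package main

import "fmt"

// selfTestTypeName mirrors the smartctl switch on test_type, including
// the vendor-specific and reserved ranges in the default branch.
func selfTestTypeName(testType int) string {
	switch testType {
	case 0x00:
		return "Offline"
	case 0x01:
		return "Short offline"
	case 0x02:
		return "Extended offline"
	case 0x03:
		return "Conveyance offline"
	case 0x04:
		return "Selective offline"
	case 0x7f:
		return "Abort offline test"
	case 0x81:
		return "Short captive"
	case 0x82:
		return "Extended captive"
	case 0x83:
		return "Conveyance captive"
	case 0x84:
		return "Selective captive"
	}
	if (0x40 <= testType && testType <= 0x7e) || 0x90 <= testType {
		return fmt.Sprintf("Vendor (0x%02x)", testType)
	}
	return fmt.Sprintf("Reserved (0x%02x)", testType)
}

func main() {
	for _, t := range []int{0x02, 0x50, 0x05} {
		fmt.Println(selfTestTypeName(t))
	}
}
```

A function handles the vendor and reserved ranges that a plain `map[int]string` lookup would otherwise collapse into "unknown".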
This adds metrics for monitoring the execution time of the latest self-tests.
It also fixes the missing `smartctl_device_self_test_log_count` metric, which was caused by the missing `--log=selftest` argument.