Commit

Add NodeSystemdServiceCrashlooping alert
Signed-off-by: Vitaly Zhuravlev <[email protected]>
v-zhuravlev committed Jun 3, 2024
1 parent dc6a1e8 commit 395c7a1
Showing 1 changed file with 14 additions and 0 deletions.
14 changes: 14 additions & 0 deletions docs/node-observ-lib/linux/alerts.libsonnet
@@ -414,6 +414,20 @@
description: 'Systemd service {{ $labels.name }} has entered failed state at {{ $labels.instance }}',
},
},
{
alert: 'NodeSystemdServiceCrashlooping',
expr: |||
increase(node_systemd_service_restart_total{%(filteringSelector)s}[5m]) > 2
||| % this.config,
'for': '15m',
labels: {
severity: 'warning',
},
annotations: {
summary: 'Systemd service keeps restarting, possibly crash looping.',
description: 'Systemd service {{ $labels.name }} has been restarted too many times at {{ $labels.instance }} for the last 15 minutes. Please check if the service is crash looping.',
},
},
]
+ if this.config.enableHardware then
[{
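Note on the expression: the following is a minimal, standalone sketch of how the `%(filteringSelector)s` placeholder in the new rule is filled in by Jsonnet string formatting. The `config` object and its `job="node"` value are hypothetical stand-ins for `this.config`; the real selector comes from the library's configuration.

```jsonnet
// Hypothetical stand-in for this.config; the real value is set by the library user.
local config = {
  filteringSelector: 'job="node"',
};

{
  alert: 'NodeSystemdServiceCrashlooping',
  // The `%` operator substitutes %(filteringSelector)s with the field of the same name,
  // so the selector renders as {job="node"}.
  expr: |||
    increase(node_systemd_service_restart_total{%(filteringSelector)s}[5m]) > 2
  ||| % config,
  // The condition must hold continuously for 15 minutes before the alert fires.
  'for': '15m',
}
```

With these values the rule would fire only when a unit's restart counter grows by more than 2 within every 5-minute window evaluated over a 15-minute period, which is why a single restart does not trigger it.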
