Add systemd service crashlooping alert (#30)
* Add NodeSystemdServiceCrashlooping alert

Signed-off-by: Vitaly Zhuravlev <[email protected]>

* Fix typo

Signed-off-by: Vitaly Zhuravlev <[email protected]>

---------

Signed-off-by: Vitaly Zhuravlev <[email protected]>
v-zhuravlev authored Jun 3, 2024
1 parent dc6a1e8 commit e8deda2
Showing 1 changed file with 14 additions and 0 deletions.
14 changes: 14 additions & 0 deletions docs/node-observ-lib/linux/alerts.libsonnet
@@ -414,6 +414,20 @@
description: 'Systemd service {{ $labels.name }} has entered failed state at {{ $labels.instance }}',
},
},
{
alert: 'NodeSystemdServiceCrashlooping',
expr: |||
increase(node_systemd_service_restart_total{%(filteringSelector)s}[5m]) > 2
||| % this.config,
'for': '15m',
labels: {
severity: 'warning',
},
annotations: {
summary: 'Systemd service keeps restarting, possibly crash looping.',
description: 'Systemd service {{ $labels.name }} has been restarted too many times on {{ $labels.instance }} over the last 15 minutes. Please check whether the service is crash looping.',
},
},
]
+ if this.config.enableHardware then
[{
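For readers unfamiliar with the `%(filteringSelector)s` placeholder in the new rule, the following is a minimal sketch of how Jsonnet's `%` string formatting would fill it in. The `config` object and its `job="node"` selector value are hypothetical, chosen only for illustration; the real value comes from `this.config` in the library and is not shown in this commit.

```jsonnet
// Hypothetical stand-in for this.config; the selector value is an assumption.
local config = { filteringSelector: 'job="node"' };

{
  // Same expression as in the new alert, formatted against the stand-in config.
  expr: |||
    increase(node_systemd_service_restart_total{%(filteringSelector)s}[5m]) > 2
  ||| % config,
}
```

With this assumed selector, the rendered expression would read `increase(node_systemd_service_restart_total{job="node"}[5m]) > 2`. Combined with `'for': '15m'`, the alert would fire only if a service keeps showing more than two restarts in each trailing 5-minute window for 15 minutes straight.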
