andreasn edited this page Dec 18, 2013 · 9 revisions

Objectives

  • Get alerted when things go bad.
  • Don't depend on having to watch the machine 24/7
  • Integrate with the journal
  • Easily accessible

Tentative Design

  • Notification area
  • Every message goes through the journal. Only messages at or above a certain priority level are considered important enough to trigger a notification event (inside Cockpit or via email/XMPP).
  • This, however, requires that a system does not log a lot of messages to the journal at that level by default.
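
The level-based filtering described above could be sketched roughly as follows. This is only an illustration, not a settled design: it assumes entries arrive as JSON lines in the shape produced by `journalctl -o json` (with `PRIORITY` and `MESSAGE` fields), and the threshold of 3 ("err") as well as the function name `important_entries` are assumptions made for the example.

```python
import json

# syslog priorities: 0 = emerg ... 3 = err ... 7 = debug (lower is more severe).
# Assumption: "err" and worse trigger a notification; the page does not fix a level.
NOTIFY_THRESHOLD = 3


def important_entries(json_lines, threshold=NOTIFY_THRESHOLD):
    """Yield (priority, message) for entries at least as severe as threshold."""
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between entries
        entry = json.loads(line)
        try:
            # Assumption: entries without a PRIORITY field are treated as info (6).
            priority = int(entry.get("PRIORITY", 6))
        except (TypeError, ValueError):
            continue  # ignore entries with a malformed priority
        if priority <= threshold:
            yield priority, entry.get("MESSAGE", "")
```

In practice the lines might come from the standard output of `journalctl -o json -f`; the lower the default volume at that level, the fewer notifications such a filter would produce.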

How to alert

  • Email and XMPP
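
For the email path, a notification could be assembled with Python's standard `email` package. This is a minimal sketch under stated assumptions: the function name `build_alert_email`, the subject format, and the addresses are all hypothetical, and XMPP delivery is not covered here.

```python
import socket
from email.message import EmailMessage


def build_alert_email(priority, message, sender, recipient, host=None):
    """Build (but do not send) an alert email for one journal entry."""
    host = host or socket.gethostname()
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Hypothetical subject format; the page does not specify one.
    msg["Subject"] = f"[{host}] alert (priority {priority})"
    msg.set_content(message)
    return msg
```

The resulting message could then be handed to an SMTP server, for example with `smtplib.SMTP("localhost").send_message(msg)`, assuming a mail server is reachable at that address.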

What to alert for

  • Too high CPU/Memory/Disk IO
  • Dying disks
  • Crashing services?
  • Power failure?
  • Failing Internet connection?
  • Packet loss?
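
The resource items at the top of this list amount to a threshold check. A rough sketch, assuming usage is sampled as percentages by some other component; the metric names and limit values are hypothetical, since this page does not define any.

```python
# Hypothetical limits, as percentages; the page does not specify any values.
LIMITS = {"cpu": 90.0, "memory": 90.0, "disk_io": 80.0}


def exceeded(sample, limits=LIMITS):
    """Return the sorted names of metrics in sample that are above their limit."""
    return sorted(
        name
        for name, value in sample.items()
        if name in limits and value > limits[name]
    )
```

For instance, `exceeded({"cpu": 95.0, "memory": 40.0})` reports only `cpu`. Each reported name could then be logged to the journal at a level that triggers a notification.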

Relevant Art

http://cdn.arstechnica.net/wp-content/uploads/2013/11/Screen-Shot-2013-10-25-at-2.57.59-AM.png

Comments

See Also
