More on notifications and reboot monitoring
Carson Gaspar
carson+nagiosusers at taltos.org
Wed Jan 5 23:48:50 CET 2005
I think I have discovered the cause of all of my problems:
Notifications are only ever triggered by check results.
Life would have been so much easier if that were documented. So in my
environment, which is passive-only for scalability reasons, if a host goes
down and stays down, the only checks that will ever trigger notifications
are pings (as they run centrally) and freshness checks.
So in order to do reboot monitoring, my choices are limited (without
writing active agents). I _think_ this should work - comments?
- On shutdown, start 30 minute (tweak to taste) scheduled downtime for Ping
(so ping won't whine about the rebooting host being down)
- On shutdown, send a passive Reboot_Down CRIT, but Reboot_Down doesn't
notify anyone
- On startup, send a passive Reboot_Up CRIT. Reboot_Up depends on
Reboot_Down, so if the server was shut down cleanly, no notification will
go out.
- On startup, send a passive Reboot_Up OK followed by a Reboot_Down OK
- Reboot_Up and Reboot_Down have freshness checks disabled.
I'd cancel the downtime if I could on startup, but there's no good way to
get the downtime ID remotely. I could write an agent that runs on the
nagios server if I decided I really cared.
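
A rough sketch of what the shutdown and startup hooks described above
might emit, written as Nagios external command lines. Everything here is
an assumption for illustration: the function names are made up, "Ping" is
assumed to be the service description of the central ping check, the
author and comment strings are placeholders, and the field order for
SCHEDULE_SVC_DOWNTIME follows the documented external command format,
which may differ between Nagios versions. How the lines get from the
client to the command file on the nagios server is left out.

```python
import time


def external_command(name, *fields, now=None):
    """Format one line as: [epoch] NAME;field1;field2;..."""
    ts = int(time.time() if now is None else now)
    return "[{}] {}".format(ts, ";".join([name] + [str(f) for f in fields]))


def passive_result(host, service, code, output, now=None):
    """Passive service check result (0=OK, 1=WARN, 2=CRIT, 3=UNKNOWN)."""
    return external_command(
        "PROCESS_SERVICE_CHECK_RESULT", host, service, code, output, now=now
    )


def ping_downtime(host, minutes=30, author="reboot-script", now=None):
    """Fixed downtime on the (assumed) Ping service, starting immediately."""
    start = int(time.time() if now is None else now)
    end = start + minutes * 60
    # Fields: host;service;start;end;fixed;trigger_id;duration;author;comment
    return external_command(
        "SCHEDULE_SVC_DOWNTIME", host, "Ping", start, end,
        1, 0, end - start, author, "planned reboot", now=start,
    )


def on_shutdown(host, now=None):
    """Commands to send from the shutdown script."""
    return [
        ping_downtime(host, now=now),
        passive_result(host, "Reboot_Down", 2, "clean shutdown in progress", now=now),
    ]


def on_startup(host, now=None):
    """Commands to send from the startup script, in order."""
    return [
        passive_result(host, "Reboot_Up", 2, "host booted", now=now),
        passive_result(host, "Reboot_Up", 0, "host is up", now=now),
        passive_result(host, "Reboot_Down", 0, "host is up", now=now),
    ]
```

The order in on_startup matters: the Reboot_Up CRIT has to be processed
while Reboot_Down is still in whatever state the shutdown left it in, and
only then are both services reset to OK.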
So on a normal reboot, no alarms. On a reboot that never comes back, ping
will alarm after the downtime ends. On an abnormal reboot, Reboot_Up will
alarm, because Reboot_Down will still be OK (or Unknown).
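
To sanity-check the reasoning, the cases can be written out as a tiny
model. This is only a sketch of the logic in this message, not of how
Nagios itself evaluates dependencies or downtime; the function name and
the returned labels are invented for illustration. It also ignores that
ping may alarm on its own during an abnormal reboot if the host stays
down long enough.

```python
def expected_alarms(clean_shutdown, comes_back):
    """Return which services should notify under the proposed scheme."""
    # A clean shutdown leaves Reboot_Down CRITICAL and Ping in downtime;
    # a crash leaves Reboot_Down at its previous state (assumed OK here).
    reboot_down = "CRITICAL" if clean_shutdown else "OK"
    ping_in_downtime = clean_shutdown

    notified = []
    if comes_back:
        # Startup sends Reboot_Up CRIT. The dependency on Reboot_Down
        # suppresses the notification only if Reboot_Down is CRITICAL.
        if reboot_down != "CRITICAL":
            notified.append("Reboot_Up")
    else:
        # The host never returns, so only the central ping check can alarm.
        if ping_in_downtime:
            notified.append("Ping (after downtime ends)")
        else:
            notified.append("Ping")
    return notified


for clean in (True, False):
    for back in (True, False):
        print(clean, back, expected_alarms(clean, back))
```

The fourth case (a crash that never comes back) isn't spelled out above,
but in this model it falls to ping with no downtime in the way.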
--
Carson