More on notifications and reboot monitoring
Andreas Ericsson
ae at op5.se
Thu Jan 6 14:51:20 CET 2005
Carson Gaspar wrote:
> I think I have discovered the cause of all of my problems:
>
> Notifications are only ever triggered by check results
>
> Life would have been so much easier if that were documented. So in my
> environment, which is passive only for scalability reasons, if a host
> goes down and stays down the only checks that will ever trigger
> notifications are pings (as they run centrally) and freshness checks.
>
> So in order to do reboot monitoring, my choices are limited (without
> writing active agents). I _think_ this should work - comments?
>
> - On shutdown, start 30 minute (tweak to taste) scheduled downtime for
> Ping (so ping won't whine about the rebooting host being down)
> - On shutdown, send a passive Reboot_Down CRIT, but Reboot_Down doesn't
> notify anyone
> - On startup, send a passive Reboot_Up CRIT. Reboot_Up depends on
> Reboot_Down, so if the server was shut down cleanly, no notification
> will go out.
> - On startup, send a passive Reboot_Up OK followed by a Reboot_Down OK
> - Reboot_Up and Reboot_Down have freshness checks disabled.
>
> I'd cancel the downtime if I could on startup, but there's no good way
> to get the downtime ID remotely. I could write an agent that runs on the
> nagios server if I decided I really cared.
>
> So on a normal reboot, no alarms. On a reboot that never comes back,
> ping will alarm after the downtime ends. On an abnormal reboot,
> Reboot_Up will alarm (as Reboot_Down will be OK (or Unknown)).
>
Ehrm. This sort of thing is exactly what scheduled downtime is for. If you
want to write a script that submits a 5-minute (or so) downtime
whenever you run reboot, then by all means feel free. If you make it
clean, I'm sure lots of other users would be interested. I don't think
it's a good idea to put that logic in the Nagios daemon itself, though:
the daemon can never know whether a host was shut down or crashed. So
I don't quite see the point of this email. Care to clarify?
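
For what it's worth, a script like that could be quite small. Below is a
rough sketch in Python, not a tested tool. It assumes the external command
file is reachable from wherever the script runs (it normally lives on the
Nagios server, so remote hosts would need some transport of their own), and
it assumes the SCHEDULE_SVC_DOWNTIME field layout with a trigger_id field as
used by Nagios 2.x and later. Older versions take a different field list, so
check the external command docs for yours. The function names, the host and
service names, and the command file path are all made up for illustration.

```python
import time


def downtime_command(host, service, minutes, author, comment, now=None):
    """Build a SCHEDULE_SVC_DOWNTIME external command line.

    Assumes the field layout with a trigger_id field:
    host;service;start;end;fixed;trigger_id;duration;author;comment
    """
    start = int(time.time()) if now is None else int(now)
    duration = minutes * 60
    end = start + duration
    # fixed=1: downtime runs exactly from start to end; trigger_id=0: not triggered
    return "[%d] SCHEDULE_SVC_DOWNTIME;%s;%s;%d;%d;1;0;%d;%s;%s" % (
        start, host, service, start, end, duration, author, comment)


def passive_result_command(host, service, return_code, output, now=None):
    """Build a PROCESS_SERVICE_CHECK_RESULT line (0=OK, 1=WARN, 2=CRIT, 3=UNKNOWN)."""
    stamp = int(time.time()) if now is None else int(now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s" % (
        stamp, host, service, return_code, output)


def submit(command_line, command_file):
    """Write one command line to the external command file (a named pipe)."""
    with open(command_file, "w") as handle:
        handle.write(command_line + "\n")


if __name__ == "__main__":
    # Hypothetical path and names; adjust to your installation.
    cmd_file = "/usr/local/nagios/var/rw/nagios.cmd"
    submit(downtime_command("web01", "Ping", 30, "reboot-script",
                            "planned reboot"), cmd_file)
    submit(passive_result_command("web01", "Reboot_Down", 2,
                                  "clean shutdown in progress"), cmd_file)
```

Calling this from the shutdown script gives you the 30 minute downtime for
Ping plus the passive Reboot_Down CRIT from the plan above. The startup side
would send the Reboot_Up and Reboot_Down results the same way.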
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Lead Developer
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null