controlling host check sensitivity
Brian Snead
BSnead at infosysnetworks.com
Tue Jul 22 22:45:37 CEST 2003
I need some ideas on slowing down the notification process when a service fails. After reading and re-reading the docs, it looks like Nagios immediately starts pinging the host if a service fails. It does this rapid-fire, up to the max_check_attempts setting. Each check takes about 10 seconds, so if you have max_check_attempts set to 10, it takes about 100 seconds before a notification is sent.
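To make the arithmetic concrete, here is a rough sketch in Python. It assumes the checks run back to back with no pause between them and that each one takes the full 10 seconds or so; the function name and parameters are made up for illustration and are not part of Nagios.

```python
def notification_delay_seconds(max_check_attempts, seconds_per_check):
    """Estimate how long the rapid-fire host checks run before a notification.

    Assumption: checks run back to back, each taking about
    seconds_per_check seconds, until max_check_attempts is reached.
    """
    return max_check_attempts * seconds_per_check


# The case described above: 10 attempts at roughly 10 seconds each.
print(notification_delay_seconds(10, 10))  # 100
```

The real delay will be shorter if the checks return before they time out, so treat this as an upper estimate.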
The sites I am monitoring are connected via microwave and are subject to rain fade, etc. The host is not really down. I have all the parents set up, but the upstream hosts (routers) have not failed yet. What I want to do is give the hosts 3-5 minutes to recover before sending notifications.
Anyone have an idea?
One idea was to set max_check_attempts to 30. But no other checks execute until the state of the host is determined. Once it finally recovers or fails, all my other service checks show as stale and then execute immediately, causing other problems.
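For reference, this is roughly how a value like 30 falls out of the 3-5 minute window I am after. It is only a sketch, assuming each host check takes about 10 seconds and the checks run back to back; the helper name is hypothetical and not anything Nagios provides.

```python
import math


def attempts_for_grace_period(grace_seconds, seconds_per_check):
    """Smallest number of back-to-back checks that spans grace_seconds.

    Assumption: every check takes about seconds_per_check seconds.
    """
    return math.ceil(grace_seconds / seconds_per_check)


# A 3 to 5 minute grace period at roughly 10 seconds per check.
low = attempts_for_grace_period(3 * 60, 10)
high = attempts_for_grace_period(5 * 60, 10)
print(low, high)  # 18 30
```

So anything from 18 to 30 attempts would cover the window, but every one of those attempts still holds up the other checks in the way described above.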
Please send me your ideas.
Brian.
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users