Host/Service check scheduling
Ethan Galstad
nagios at nagios.org
Tue Apr 29 01:05:37 CEST 2003
Host checks are performed when:
1) A service changes state
2) A service is first checked and is found to be OK (if aggressive
checking is on)
3) A service is checked and remains in a non-OK state (if aggressive
checking is on)
They may also occur if an upstream or downstream host changes state.
In this case, host checks might be propagated up and down the
parenting tree to see if the states of other hosts have changed.
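
The three service-driven triggers listed above can be sketched as a small decision function. This is only an illustration of the rules as stated in this message, not Nagios source code: the function name, its arguments, and the string state names are hypothetical, and treating a first check that returns non-OK as a state change is an assumption the list above does not spell out. Propagation through the parenting tree is not modelled.

```python
from typing import Optional

OK = "OK"


def host_check_triggered(previous: Optional[str], current: str,
                         aggressive: bool) -> bool:
    """Return True if a service check result would trigger a host check.

    previous is the service's last known state, or None on its first check.
    aggressive mirrors the "aggressive checking" setting mentioned above.
    """
    first_check = previous is None

    # Rule 1: the service changed state.
    if not first_check and previous != current:
        return True
    # Assumption (not in the list above): a first check that comes back
    # non-OK is treated as a state change.
    if first_check and current != OK:
        return True
    # Rule 2: first check found the service OK (aggressive checking only).
    if aggressive and first_check and current == OK:
        return True
    # Rule 3: the service remains non-OK (aggressive checking only).
    if aggressive and not first_check and previous != OK and current != OK:
        return True
    return False


if __name__ == "__main__":
    print(host_check_triggered("OK", "CRITICAL", aggressive=False))
    print(host_check_triggered("CRITICAL", "CRITICAL", aggressive=False))
```

With aggressive checking off, only the first call reports a trigger; a service that stays CRITICAL causes no further host checks.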
Based on your description, you need to choose another method of
checking your host. If 5 consecutive host checks fail while the
host is really up, something needs to be reconfigured. IMHO, a
50-second failure of a host check should indicate that there's some
kind of problem.
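
The 50-second figure follows from the numbers in the quoted message: 5 attempts at roughly one attempt every 10 seconds. The sketch below only reproduces that arithmetic. The function names are hypothetical, and the 10-second spacing is the poster's observation, not a documented constant. It is not a recommendation to raise the attempt count.

```python
import math


def failure_window_seconds(max_attempts: int, seconds_per_attempt: float) -> float:
    """Time covered by max_attempts back-to-back host checks."""
    return max_attempts * seconds_per_attempt


def attempts_for_window(target_seconds: float, seconds_per_attempt: float) -> int:
    """Attempts needed to cover at least target_seconds."""
    return math.ceil(target_seconds / seconds_per_attempt)


if __name__ == "__main__":
    # Poster's setup: 5 attempts, about 10 seconds apart.
    print(failure_window_seconds(5, 10))   # 50
    # Attempts needed to cover the 5 minutes the poster asked about.
    print(attempts_for_window(300, 10))    # 30
```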
On 28 Apr 2003 at 13:45, Denise Sandell wrote:
>
> Howdy,
>
> I have a question/concern about the way host and service checks are
> handled.
>
> Host checks are only supposed to be initiated when a service check
> fails. So at what point does the host check initiate, when a service
> returns a "soft" non-OK state, or a "hard" non-OK state? I've
> pored over the documentation and can't find anything that explicitly
> states what the function is.
>
> As far as I can tell operationally speaking, a host check is initiated
> immediately when a service check fails (i.e., a soft fail state). Yet,
> this is not clear from the event logs. I never see a failed/soft
> service check before a host check is initiated noted in the logs.
> As a result, my service checks fail once, a host check is initiated,
> it counts through max attempts defined for a host, and then issues
> a notification.
>
> This is rather annoying since I would like to receive notifications
> for things down >5 minutes. My max_attempts for my host template is
> currently set at 5 (with an interval_length of 60), so once Nagios
> initiates host checks, it does so every 10 seconds 5 times in a row.
> As a result, I get notifications for things down for 50 seconds.
>
> The only way I've managed to get around this is to completely disable
> check_commands for hosts. This leads to some general retardedness for
> availability of hosts/services. If my service check for ping fails,
> the host goes down, but it is never marked as such by the system.
>
> So, the questions I have are these:
>
> . is this the intended function for network availability?
> . is there an undocumented way to apply check intervals to host checks?
> and if not, is this being considered as an option for future development?
>
> --
> - Denise Sandell network operations -
> - dsandell at voyager.net voyager.net, a CoreComm company -
>
>
>
> -------------------------------------------------------
> This sf.net email is sponsored by:ThinkGeek
> Welcome to geek heaven.
> http://thinkgeek.com/sf
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>
Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org