Host Check Clarification
Bishop, Dean
dean.bishop at tcdsb.org
Tue Oct 22 14:05:52 CEST 2002
The way to do this is to use the retry_check_interval and
max_check_attempts.
upon failure (this applies to both services and hosts) the
normal_check_interval is not used. Rather the retry_check_interval is used.
The host/service will not become hard Non-OK until/unless max_check_attempts
is reached without getting an OK result.
so to avoid notifications for temporary outages, retry [max_check_attempts]
times every [retry_check_interval] minutes.
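
As a rough illustration of the arithmetic above, here is a minimal Python sketch (not Nagios code). It assumes the first Non-OK result counts as attempt 1, that intervals are in minutes, and that any OK result resets the attempt counter. The function names are made up for this example.

```python
def minutes_until_hard_state(max_check_attempts, retry_check_interval):
    """Minutes between the first Non-OK result and the check that would
    make the state hard, assuming every retry also fails.

    Assumption: the first failing check is attempt 1, so only
    (max_check_attempts - 1) retries are spaced retry_check_interval apart.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return (max_check_attempts - 1) * retry_check_interval


def first_hard_failure(results, max_check_attempts):
    """Return the index of the check at which the state becomes hard
    Non-OK, or None if it never does.

    results is a list of booleans, True meaning the check returned OK.
    """
    attempt = 0
    for index, ok in enumerate(results):
        if ok:
            attempt = 0  # an OK result resets the attempt counter
        else:
            attempt += 1
            if attempt >= max_check_attempts:
                return index
    return None
```

With illustrative values of max_check_attempts = 4 and retry_check_interval = 1, an outage would have to last about 3 minutes before a notification could go out, which would ride out the 1 or 2 minute routing problems described below.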
hope this helps,
dean
-----Original Message-----
From: Kevin Miller [mailto:kmiller at inflow.com]
Sent: Monday, October 21, 2002 6:53 PM
To: 'Bishop, Dean'
Subject: RE: [Nagios-users] Host Check Clarification
Thanks, that is what I assumed. What I am actually looking for is a way to
suppress host down alerts from notifying me so quickly. I am monitoring
hosts across the internet and therefore cannot control everything. Very
often there will be a temporary routing problem that will clear up after 1
or 2 mins. I would like nagios to keep trying for a few mins before paging
me.
Any ideas?
Thanks
-----Original Message-----
From: Bishop, Dean [mailto:dean.bishop at tcdsb.org]
Sent: Monday, October 21, 2002 3:03 PM
To: 'Kevin Miller '; 'nagios-users at lists.sourceforge.net '
Subject: RE: [Nagios-users] Host Check Clarification
i am away from my docs right now but here is how it works.
if a service check, any service check (this would include the first of
many), returns a Non-OK status, then the host is checked.
if the host checks OK, then the services are scheduled for check using the
service's retry_check_interval. If the service stays Non-OK until
max_attempts, then the service notification is sent.
if the host check is Non-OK, then the host is pounded. If it stays Non-OK
until max_attempts (for the host) then the host notification is sent.
under both of these circumstances the service is now rescheduled at its
normal_check_interval.
the difference is that if the host is down, then service notifications are
squelched.
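
The flow described above can be sketched in Python roughly as follows. This is a simplified model of the description in this message, not Nagios's actual implementation; the function names and the list-of-booleans representation (True meaning OK) are assumptions made for the example.

```python
def host_is_down(host_results, host_max_attempts):
    """Host is checked until it returns OK or host_max_attempts is reached."""
    checked = host_results[:host_max_attempts]
    if any(checked):
        return False  # one OK host check is enough to call the host up
    if len(checked) < host_max_attempts:
        raise ValueError("not enough host check results to decide")
    return True


def notification_sent(host_results, service_retries,
                      host_max_attempts, service_max_attempts):
    """Return "host", "service", or None after a first Non-OK service check.

    The initial failing service check is treated as attempt 1, so
    service_retries holds the results of attempts 2..service_max_attempts.
    """
    if host_is_down(host_results, host_max_attempts):
        # Host is down: the host notification goes out and service
        # notifications are squelched.
        return "host"
    retries = service_retries[:service_max_attempts - 1]
    if any(retries):
        return None  # service recovered before reaching max attempts
    if len(retries) < service_max_attempts - 1:
        raise ValueError("not enough service retry results to decide")
    return "service"
```

Note that in this model the host result is decided before any service retry is looked at, which matches the observation that a host notification can arrive while the service is still on its first attempt.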
later,
dean
-----Original Message-----
From: Kevin Miller
To: nagios-users at lists.sourceforge.net
Sent: 10/21/2002 4:14 PM
Subject: [Nagios-users] Host Check Clarification
Looking for some clarification on Nagios host checking. I am monitoring
the SSH service on multiple hosts. From what I understand, when the SSH
service check has problems, Nagios then tries to do a host check.
From the documentation:
"One instance where Nagios checks the status of a host is when a service
check results in a non-OK status. Nagios checks the host to decide
whether or not the host is up, down, or unreachable. If the first host
check returns a non-OK state, Nagios will keep pounding out checks of
the host until either (a) the maximum number of host checks (specified
by the max_attempts option in the host definition) is reached or (b) a
host check results in an OK state."
The documentation states that Nagios dedicates all resources to checking
this host and then sends a notification that the host is down. The part
that seems a little strange to me is that often I will get a Host Down
notification while Nagios is still doing test 1 out of 3 for the SSH
service. I have my max_attempts set to 10 for each host; what is the
interval between these attempts? Is there any way to tell Nagios to
perform host checks that are a certain interval apart (just like in
service checks) before sending a notification?
Thanks