What the...
Russell Scibetti
russell at quadrix.com
Thu Oct 10 20:40:32 CEST 2002
The only time Nagios will stop running a service's checks at its
normal_check_interval is when that service has a servicedependency
whose execution failure criteria are met. Otherwise, service checks
continue as scheduled. Nagios learns that a host has come back up when
any service on that host recovers to OK. While a host and its services
are down, a scheduled service check won't go through all the retries
(it's already in a hard state, so there's no need to retry), but it
will still check the service once.
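
To make that rule concrete, here is a toy Python model of it. This is
only a sketch of the behaviour described above, not Nagios source code;
ServiceDependency, should_run_check and checks_this_interval are names
made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServiceDependency:
    """Hypothetical stand-in for one servicedependency definition."""
    master_state: str  # current state of the service being depended on
    execution_failure_criteria: frozenset = frozenset()  # states that suppress checks

    def blocks_execution(self) -> bool:
        # The dependency suppresses the check only when the master
        # service is in one of the listed failure states.
        return self.master_state in self.execution_failure_criteria


def should_run_check(dependencies) -> bool:
    """A scheduled service check is skipped only if some dependency blocks it."""
    return not any(dep.blocks_execution() for dep in dependencies)


def checks_this_interval(state_type: str, max_check_attempts: int) -> int:
    """In a HARD problem state the service is checked once, with no retries."""
    return 1 if state_type == "HARD" else max_check_attempts
```

With no servicedependency defined, should_run_check([]) is True, which
is why a service on an unreachable host keeps getting checked.
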
Also, do you have aggressive_host_checking enabled in your nagios.cfg?
The only reason I can think of for the host check also occurring when
the service check occurs is that you have that setting enabled.
Otherwise a host will only get checked after the first service check
failure (when the host is still up).
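
If you want to compare when the host and service checks fired, the
alert lines in nagios.log can be split into fields. A minimal Python
sketch, assuming the semicolon-separated layout seen in the quoted log
below; parse_alert is a hypothetical helper, not part of Nagios, and
the two sample lines are copied from that log.

```python
import re
from datetime import datetime, timezone

# Matches "[epoch] HOST ALERT: ..." and "[epoch] SERVICE ALERT: ..."
ALERT_LINE = re.compile(r"^\[(\d+)\] (HOST|SERVICE) ALERT: (.*)$")


def parse_alert(line):
    """Return a dict for an alert line, or None for any other line."""
    match = ALERT_LINE.match(line.strip())
    if match is None:
        return None
    epoch, kind, rest = match.groups()
    fields = rest.split(";")
    if kind == "HOST":
        # host;state;state_type;attempt;plugin output
        host, state, state_type, attempt = fields[:4]
        service, output = None, ";".join(fields[4:])
    else:
        # host;service;state;state_type;attempt;plugin output
        host, service, state, state_type, attempt = fields[:5]
        output = ";".join(fields[5:])
    return {
        "time": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
        "kind": kind,
        "host": host,
        "service": service,
        "state": state,
        "state_type": state_type,
        "attempt": int(attempt),
        "output": output,
    }


sample = [
    "[1034267933] HOST ALERT: testserver01.tcdsb.org;DOWN;HARD;2;"
    "CRITICAL - Plugin timed out after 8 seconds",
    "[1034267934] SERVICE ALERT: testserver01.tcdsb.org;"
    "Misc Servers - Port Check 135;CRITICAL;HARD;1;Socket timeout after 2 seconds",
]
events = [parse_alert(line) for line in sample]
for event in events:
    print(event["time"].isoformat(), event["kind"], event["state"], event["state_type"])
```
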
Hope this helps.
-Russell
Bishop, Dean wrote:
> First, sorry about the subject; I realize that it is inappropriate.
> It does, however, capture my initial response.
>
> We are in the midst of many nightmares concurrently: smoking servers,
> irreplaceable data lost, network latency, cold lunch, sore finger, you
> know, the whole gamut at once.
>
> Apologies to all.
>
> Here is another entry from my logs. Each host is dependent on the
> previously numbered host (e.g. Marshall-McLuhan-0561SW2A_4-HS7 is the
> parent of Marshall-McLuhan-0561SW2A_5-HS7, which is the parent of
> Marshall-McLuhan-0561SW2A_6-HS7, etc.).
>
> Why, once Marshall-McLuhan-0561SW2A_14-HS7 is determined to be
> UNREACHABLE (due to the failure of Marshall-McLuhan-0561SW2A_4-HS7),
> is the service on Marshall-McLuhan-0561SW2A_14-HS7 still checked?
>
>
>
> [1034172479] HOST ALERT: Marshall-McLuhan-0561SW2A_14-HS7;DOWN;SOFT;1;CRITICAL - Plugin timed out after 18 seconds
> [1034172516] HOST ALERT: Marshall-McLuhan-0561SW2A_7-HS7;DOWN;SOFT;1;CRITICAL - Plugin timed out after 18 seconds
> [1034172552] HOST ALERT: Marshall-McLuhan-0561SW2A_6-HS7;DOWN;SOFT;1;CRITICAL - Plugin timed out after 18 seconds
> [1034172588] HOST ALERT: Marshall-McLuhan-0561SW2A_5-HS7;DOWN;SOFT;1;CRITICAL - Plugin timed out after 18 seconds
> [1034172624] HOST ALERT: Marshall-McLuhan-0561SW2A_4-HS7;DOWN;SOFT;1;CRITICAL - Plugin timed out after 18 seconds
> [1034172644] HOST ALERT: Marshall-McLuhan-0561SW2A_4-HS7;DOWN;HARD;2;CRITICAL - Plugin timed out after 18 seconds
> [1034172644] HOST NOTIFICATION: nagiosadmin;Marshall-McLuhan-0561SW2A_4-HS7;DOWN;host-notify-by-email;CRITICAL - Plugin timed out after 18 seconds
> [1034172645] HOST NOTIFICATION: Marco;Marshall-McLuhan-0561SW2A_4-HS7;DOWN;host-notify-by-email;CRITICAL - Plugin timed out after 18 seconds
> [1034172645] HOST NOTIFICATION: Kevin-NonCritical;Marshall-McLuhan-0561SW2A_4-HS7;DOWN;notify-by-epager;CRITICAL - Plugin timed out after 18 seconds
> [1034172645] HOST NOTIFICATION: Kevin;Marshall-McLuhan-0561SW2A_4-HS7;DOWN;host-notify-by-email;CRITICAL - Plugin timed out after 18 seconds
> [1034172646] HOST NOTIFICATION: Keith-NonCritical;Marshall-McLuhan-0561SW2A_4-HS7;DOWN;notify-by-epager;CRITICAL - Plugin timed out after 18 seconds
> [1034172646] HOST NOTIFICATION: Keith;Marshall-McLuhan-0561SW2A_4-HS7;DOWN;host-notify-by-email;CRITICAL - Plugin timed out after 18 seconds
> [1034172646] HOST NOTIFICATION: Ben;Marshall-McLuhan-0561SW2A_4-HS7;DOWN;host-notify-by-email;CRITICAL - Plugin timed out after 18 seconds
> [1034172647] HOST ALERT: Marshall-McLuhan-0561SW2A_5-HS7;UNREACHABLE;HARD;2;CRITICAL - Plugin timed out after 18 seconds
> [1034172647] HOST ALERT: Marshall-McLuhan-0561SW2A_6-HS7;UNREACHABLE;HARD;2;CRITICAL - Plugin timed out after 18 seconds
> [1034172647] HOST ALERT: Marshall-McLuhan-0561SW2A_7-HS7;UNREACHABLE;HARD;2;CRITICAL - Plugin timed out after 18 seconds
> [1034172647] HOST ALERT: Marshall-McLuhan-0561SW2A_14-HS7;UNREACHABLE;HARD;2;CRITICAL - Plugin timed out after 18 seconds
> [1034172647] SERVICE ALERT: Marshall-McLuhan-0561SW2A_14-HS7;Port Check-23;CRITICAL;HARD;1;Socket timeout after 10 seconds
>
>
> -----Original Message-----
> From: Bishop, Dean
> Sent: Thursday, October 10, 2002 1:04 PM
> To: 'nagios-users at lists.sourceforge.net'
> Subject: What the *&#( !!
> Importance: High
>
>
> Can someone explain this to me?
>
>
> Why in the world is the service for testserver01.tcdsb.org being
> checked after the host has been determined to be down?
> Also, why is the host being checked before the service?
>
>
>
>
> [root at NMS var]# tail nagios.log -n 3000 |grep testserver01
>
> [1034266896] HOST ALERT: testserver01.tcdsb.org;UP;HARD;1;(Host assumed to be up)
> [1034266896] SERVICE ALERT: testserver01.tcdsb.org;Misc Servers - Port Check 135;OK;HARD;1;TCP OK - 0 second response time on port 135
> [1034267924] HOST ALERT: testserver01.tcdsb.org;DOWN;SOFT;1;CRITICAL - Plugin timed out after 8 seconds
> [1034267933] HOST ALERT: testserver01.tcdsb.org;DOWN;HARD;2;CRITICAL - Plugin timed out after 8 seconds
> [1034267933] HOST NOTIFICATION:nagiosadmin;testserver01.tcdsb.org;DOWN;host-notify-by-email;CRITICAL - Plugin timed out after 8 seconds
> [1034267934] HOST NOTIFICATION:Keith;testserver01.tcdsb.org;DOWN;host-notify-by-email;CRITICAL - Plugin timed out after 8 seconds
> [1034267934] SERVICE ALERT: testserver01.tcdsb.org;Misc Servers - Port Check 135;CRITICAL;HARD;1;Socket timeout after 2 seconds
> [1034268938] HOST ALERT: testserver01.tcdsb.org;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.61 ms
> [1034268938] HOST NOTIFICATION:nagiosadmin;testserver01.tcdsb.org;UP;host-notify-by-email;PING OK - Packet loss = 0%, RTA = 0.61 ms
> [1034268938] HOST NOTIFICATION:Keith;testserver01.tcdsb.org;UP;host-notify-by-email;PING OK - Packet loss = 0%, RTA = 0.61 ms
> [1034268938] SERVICE ALERT: testserver01.tcdsb.org;Misc Servers - Port Check 135;OK;HARD;1;TCP OK - 0 second response time on port 135
>
> [root at NMS var]#
>
--
Russell Scibetti
Quadrix Solutions, Inc.
http://www.quadrix.com
(732) 235-2335, ext. 7038