service notification when host is down
Samuel Bancal
sam.bancal at gmail.com
Thu Feb 18 10:47:52 CET 2010
Thanks for your answer.
That part looks like normal behavior to me too.
What does not look normal to me is that, between two checks, Nagios
jumps from "SOFT 1" straight to "HARD 1" instead of stepping through
"SOFT 1" > "SOFT 2" > "SOFT 3" and finally "HARD 4".
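To make the expectation concrete, here is a minimal Python sketch of the progression I expected with max_check_attempts 4. It only illustrates my understanding of the retry logic and is not Nagios code; the function name expected_states is made up for this example.

```python
def expected_states(max_check_attempts, failed_checks):
    """Return (state_type, attempt) for each consecutive failed check.

    Assumption: the check stays SOFT until the attempt counter reaches
    max_check_attempts, at which point it becomes HARD.
    """
    states = []
    for check in range(1, failed_checks + 1):
        # The attempt counter does not grow past max_check_attempts.
        attempt = min(check, max_check_attempts)
        state_type = "HARD" if attempt >= max_check_attempts else "SOFT"
        states.append((state_type, attempt))
    return states


# Same value as max_check_attempts in the service definition.
for state_type, attempt in expected_states(4, 4):
    print(f"CRITICAL;{state_type};{attempt}")
```

The history quoted below instead shows the service going from CRITICAL;SOFT;1 directly to CRITICAL;HARD;1.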
Regards,
Samuel Bancal
2010/2/17 Morris, Patrick <patrick.morris at hp.com>
> Samuel Bancal wrote:
>
>> Nagios Core 3.2.0
>> nagios-plugins-1.4.14
>> Ubuntu server 8.04.3 LTS
>>
>> Hi,
>>
>> I'm having trouble configuring notifications for the case where a server
>> no longer responds to PING (ICMP).
>> I don't understand why Nagios skips steps when it runs the
>> "icmp" service check.
>> Here is the config:
>>
>> define host{
>>     use                     generic-server
>>     host_name               server1
>>     alias                   server1
>>     address                 the.ip.the.ip
>>     hostgroups              prod-servers
>>     contact_groups          group1
>>     check_command           check-host-alive
>>     check_period            24x7
>>     check_interval          5
>>     retry_interval          1
>>     max_check_attempts      4
>>     notification_period     24x7
>>     notification_interval   60
>>     notification_options    d,u,r
>> }
>>
>> define service{
>>     use                     generic-service
>>     host_name               server1
>>     service_description     ICMP
>>     check_command           check_icmp!100.0,20%!500.0,60%
>>     max_check_attempts      4
>>     normal_check_interval   5
>>     retry_check_interval    1
>>     notification_options    w,u,c,r
>>     notification_interval   60
>>     notification_period     24x7
>> }
>> [...]
>> define command{
>>     command_name    check-host-alive
>>     command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
>> }
>> define command{
>>     command_name    check_icmp
>>     command_line    $USER1$/check_icmp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
>> }
>> [...]
>>
>> Here is an example of the history I get:
>> Service Critical [2010-02-16 11:33:13] SERVICE ALERT: server1;ICMP;CRITICAL;SOFT;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
>> Host Down [2010-02-16 11:33:43] HOST ALERT: server1;DOWN;SOFT;1;(Host Check Timed Out)
>> Service Critical [2010-02-16 11:34:13] SERVICE ALERT: server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
>> Host Down [2010-02-16 11:34:43] HOST ALERT: server1;DOWN;SOFT;2;(Host Check Timed Out)
>> Host Down [2010-02-16 11:35:23] HOST ALERT: server1;DOWN;SOFT;3;(Host Check Timed Out)
>> Host Down [2010-02-16 11:36:33] HOST ALERT: server1;DOWN;HARD;4;(Host Check Timed Out)
>> Host Up [2010-02-16 11:37:43] HOST ALERT: server1;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.67 ms
>> Service Ok [2010-02-16 11:39:13] SERVICE ALERT: server1;ICMP;OK;HARD;1;OK - the.ip.the.ip: rta 0.943ms, lost 0%
>>
>> Or later :
>> Host Down [2010-02-16 11:42:03] HOST ALERT: server1;DOWN;SOFT;1;(Host Check Timed Out)
>> Host Down [2010-02-16 11:43:13] HOST ALERT: server1;DOWN;SOFT;2;(Host Check Timed Out)
>> Service Critical [2010-02-16 11:44:13] SERVICE ALERT: server1;ICMP;CRITICAL;HARD;1;CRITICAL - the.ip.the.ip: rta nan, lost 100%
>> Host Down [2010-02-16 11:44:43] HOST ALERT: server1;DOWN;SOFT;3;(Host Check Timed Out)
>> Host Up [2010-02-16 11:45:53] HOST ALERT: server1;UP;SOFT;4;PING OK - Packet loss = 0%, RTA = 0.64 ms
>> Service Ok [2010-02-16 11:49:13] SERVICE ALERT: server1;ICMP;OK;HARD;1;OK - the.ip.the.ip: rta 0.948ms, lost 0%
>>
>
> If you're asking why Nagios runs a host check when it sees the service fail
> a check, that's normal behavior.
>
> When a service check fails, the first thing Nagios will do is look to see
> if the service failed because the host is down.
>
--
Samuel Bancal - CH