parent/child setup not working
Andy Shellam (Mailing Lists)
andy.shellam-lists at mailnetwork.co.uk
Sat Jan 6 00:38:11 CET 2007
If I understand it right, your host checks should not be scheduled - but
your service checks are.
So, every time a service requires checking and Nagios finds the service
is down, it checks the host to see if the host is down. If it is, then
it suppresses notifications for the service and instead goes into the
host's notification handling.
However, I'm not sure whether this also applies to escalated service
notifications. You have a notification_interval set; try commenting
it out (or setting it to 0) and see whether the same thing still happens.
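
As a rough sketch of that test, assuming everything else in the service
definition you posted stays the same, only the notification_interval line
changes. A value of 0 tells Nagios to send a single problem notification
and not re-notify; the inline comment is mine, not from your config:

```cfg
define service{
        use                     generic-service
        hostgroup_name          webservers
        service_description     Check Simple Webservers
        is_volatile             0
        check_period            24x7
        max_check_attempts      5
        normal_check_interval   5
        retry_check_interval    2
        contact_groups          ops
        notification_interval   0       ; was 120; 0 = no re-notifications
        notification_period     24x7
        notification_options    w,u,c,r
        check_command           check_http
        }
```

If the service alert still arrives with this in place, it is probably not
a reminder notification. Also bear in mind that commenting the line out
instead may just make the service inherit whatever interval the
generic-service template sets.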
Andy.
David Miller wrote:
> Andy Shellam (Mailing Lists) wrote:
>
> Arghh! Sorry for the previous, content-free reply.
>
> The service entry is;
>
> define service{
>         use                     generic-service ; Name of service template to use
>         hostgroup_name          webservers
>         service_description     Check Simple Webservers
>         is_volatile             0
>         check_period            24x7
>         max_check_attempts      5
>         normal_check_interval   5
>         retry_check_interval    2
>         contact_groups          ops
>         notification_interval   120
>         notification_period     24x7
>         notification_options    w,u,c,r
>         check_command           check_http
>         }
>
> But the point is, unless I'm missing something, that the service
> should not be checked at all if the parent is down.
>
> Thanks!
>
> --- David
>
>> Hi David,
>>
>> I'm not clued up on parent/child relationships between hosts, but
>> one thing I believe might be happening is that the service alert
>> you've sent as an example is a "reminder" notification that the
>> service is still down (perhaps as a result of escalation settings?).
>>
>> I think this because the state line shows a duration, i.e.
>> "CRITICAL for xxxxx" as opposed to just "CRITICAL".
>>
>> What's the definition for that service?
>>
>> Andy.
>>
>>
>> David Miller wrote:
>>> Hi;
>>>
>>> I'm not sure what I'm doing wrong.
>>>
>>> Running nagios 2.5 on debian-stable. I have the nagios server in
>>> one data center monitoring 30ish servers in another data center.
>>>
>>> In the hosts.cfg file I have a gateway (firewall) defined:
>>>
>>> define host {
>>>         use                     generic-host    ; Name of host template to use
>>>         host_name               pix
>>>         alias                   PIX
>>>         address                 x.y.z.2
>>>         check_command           check-host-alive
>>>         max_check_attempts      1
>>>         notification_interval   1
>>>         notification_period     24x7
>>>         notification_options    d,u,r
>>>         }
>>>
>>>
>>> I then use that as a parent to all the hosts I want to monitor in
>>> the remote data center. Those have host entries like this;
>>>
>>>
>>> define host {
>>>         use                     generic-host    ; Name of host template to use
>>>         host_name               logweb1
>>>         alias                   Logweb1
>>>         address                 logweb1.foo.com
>>>         parents                 pix
>>>         max_check_attempts      1
>>>         active_checks_enabled   0
>>>         notification_interval   1
>>>         notification_period     24x7
>>>         notification_options    d,r
>>>         }
>>>
>>> As I read the documentation, when Nagios detects that host "pix" is
>>> down, it won't check or report on host logweb1.
>>>
>>> If the network connection is broken, however, by deleting the
>>> default route, I get three messages that the pix is down that look
>>> like this:
>>>
>>> Subject:** PROBLEM alert 1 - PIX host is DOWN **
>>>
>>> ***** Nagios *****
>>>
>>> Notification Type: PROBLEM
>>> Host: PIX
>>> State: DOWN for 0d 0h 0m 0s
>>> Address: 66.151.232.2
>>> Info:
>>>
>>> CRITICAL - Network unreachable (x.y.z.2)
>>>
>>> Date/Time: Fri Jan 5 16:17:48 EST 2007
>>>
>>> ACK by: Comment:
>>>
>>> And a few minutes later I get notice on the child server:
>>>
>>> Subject: ** PROBLEM alert 1 - Logweb1/Check Simple Webservers is CRITICAL **
>>>
>>> ***** Nagios *****
>>>
>>> Notification Type: PROBLEM
>>>
>>> Service: Check Simple Webservers
>>> Host: Logweb1
>>> State: CRITICAL for 0d 0h 8m 6s
>>> Address: logweb1.foo.com
>>>
>>> Info:
>>>
>>> Network is unreachable
>>>
>>> Date/Time: Fri Jan 5 16:29:28 EST 2007
>>>
>>> ACK by: Comment:
>>>
>>> What am I doing wrong?
>>>
>>> Thanks in advance,
>>>
>>> --- David
>>>
>>>
>>>
>>> _______________________________________________
>>> Nagios-users mailing list
>>> Nagios-users at lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/nagios-users
>>> ::: Please include Nagios version, plugin version (-v) and OS when
>>> reporting any issue. ::: Messages without supporting info will risk
>>> being sent to /dev/null
--
Andy Shellam
NetServe Support Team
the Mail Network
"an alternative in a standardised world"
p: +44 (0) 121 288 0832/0839
m: +44 (0) 7818 000834