parent/child setup not working
David Miller
nagios at d.sparks.net
Sat Jan 6 02:51:28 CET 2007
Andy Shellam (Mailing Lists) wrote:
> If I understand it right, your host checks should not be scheduled -
> but your service checks are.
> So, every time a service requires checking and Nagios finds the
> service is down, it checks the host to see if the host is down. If it
> is, then it suppresses notifications for the service and instead goes
> into the host's notification handling.
That makes some sense - all the hosts with host checks don't send
unwanted notices.
I'm still missing something though, unless the design is that:
host checks aren't executed if the parent is down, but if no host check
is specified the host is still presumed up,
and
service checks are performed as long as the host is known or presumed to
be up.
But that would make the parent relationship specification fairly useless.
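
A minimal sketch of how the child entry might look if the presumed-up
reading above is right. This is an assumption, not something confirmed in
this thread: it gives logweb1 the same check-host-alive command the pix
uses and drops active_checks_enabled 0, so that Nagios has a way to test
the host when one of its services fails. Untested.

```cfg
define host {
use generic-host ; Name of host template to use
host_name logweb1
alias Logweb1
address logweb1.foo.com
parents pix
check_command check-host-alive ; assumption: added so the host can be checked on demand
max_check_attempts 1
notification_interval 1
notification_period 24x7
notification_options d,r ; unchanged: no "u", so no notices for unreachable
}
```

With a host check available, a failed service check should lead Nagios to
check logweb1 and then its parent pix. With pix down, logweb1 would be
treated as unreachable rather than presumed up, and the service notice
should be suppressed.
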
>
> However I'm not sure if this is the case for escalated service
> notifications. You have a notification_interval set - try commenting
> this out (or setting to 0) and see if you then get the same thing
> happening.
It's a required field, so commenting it out doesn't work. I set it to
0, deleted the default route, and got the same result: three notices
that the pix was down, a five minute wait, and a notice that this host
was down.
--- David
>
> Andy.
>
>
> David Miller wrote:
>> Andy Shellam (Mailing Lists) wrote:
>>
>> Arghh! Sorry for the previous, content free reply.
>>
>> The service entry is;
>>
>> define service{
>> use generic-service ; Name of service template to use
>> hostgroup_name webservers
>> service_description Check Simple Webservers
>> is_volatile 0
>> check_period 24x7
>> max_check_attempts 5
>> normal_check_interval 5
>> retry_check_interval 2
>> contact_groups ops
>> notification_interval 120
>> notification_period 24x7
>> notification_options w,u,c,r
>> check_command check_http
>> }
>> But the point is, unless I'm missing something, that the service
>> should not be checked at all if the parent is down.
>>
>> Thanks!
>>
>> --- David
>>
>>> Hi David,
>>>
>>> I'm not clued up on parent/child relationships between hosts,
>>> however one thing I believe might be happening is that the example
>>> of the alert you've sent for the service - it might be a "reminder"
>>> notification that the service is still down. (Perhaps as a result
>>> of escalation settings?)
>>>
>>> I think this is because it has a duration in the state variable, i.e.
>>> "CRITICAL for xxxxx" as opposed to just "CRITICAL."
>>>
>>> What's the definition for that service?
>>>
>>> Andy.
>>>
>>>
>>> David Miller wrote:
>>>> Hi;
>>>>
>>>> I'm not sure what I'm doing wrong.
>>>>
>>>> Running nagios 2.5 on debian-stable. I have the nagios server in
>>>> one data center monitoring 30ish servers in another data center.
>>>>
>>>> In the hosts.cfg file I have a gateway (firewall) defined:
>>>>
>>>> define host {
>>>> use generic-host ; Name of host template to use
>>>> host_name pix
>>>> alias PIX
>>>> address x.y.z.2
>>>> check_command check-host-alive
>>>> max_check_attempts 1
>>>> notification_interval 1
>>>> notification_period 24x7
>>>> notification_options d,u,r
>>>> }
>>>>
>>>>
>>>> I then use that as a parent to all the hosts I want to monitor in
>>>> the remote data center. Those have host entries like this;
>>>>
>>>>
>>>> define host {
>>>> use generic-host ; Name of host template to use
>>>> host_name logweb1
>>>> alias Logweb1
>>>> address logweb1.foo.com
>>>> parents pix
>>>> max_check_attempts 1
>>>> active_checks_enabled 0
>>>> notification_interval 1
>>>> notification_period 24x7
>>>> notification_options d,r
>>>> }
>>>>
>>>> As I read the documentation, when nagios detects that host "pix" is
>>>> down, it won't check or report on host logweb1.
>>>>
>>>> If the network connection is broken, however, by deleting the
>>>> default route, I get three messages that the pix is down that look
>>>> like this:
>>>>
>>>> Subject: ** PROBLEM alert 1 - PIX host is DOWN **
>>>>
>>>> ***** Nagios *****
>>>>
>>>> Notification Type: PROBLEM
>>>> Host: PIX
>>>> State: DOWN for 0d 0h 0m 0s
>>>> Address: 66.151.232.2
>>>> Info:
>>>>
>>>> CRITICAL - Network unreachable (x.y.z.2)
>>>>
>>>> Date/Time: Fri Jan 5 16:17:48 EST 2007
>>>>
>>>> ACK by: Comment:
>>>>
>>>> And a few minutes later I get notice on the child server:
>>>>
>>>> Subject: ** PROBLEM alert 1 - Logweb1/Check Simple Webservers is
>>>> CRITICAL **
>>>>
>>>> ***** Nagios *****
>>>>
>>>> Notification Type: PROBLEM
>>>>
>>>> Service: Check Simple Webservers
>>>> Host: Logweb1
>>>> State: CRITICAL for 0d 0h 8m 6s
>>>> Address: logweb1.foo.com
>>>>
>>>> Info:
>>>>
>>>> Network is unreachable
>>>>
>>>> Date/Time: Fri Jan 5 16:29:28 EST 2007
>>>>
>>>> ACK by: Comment:
>>>>
>>>> What am I doing wrong?
>>>>
>>>> Thanks in advance,
>>>>
>>>> --- David
>>>>
>>>>
>>>>
>>>> -------------------------------------------------------------------------
>>>>
>>>> Take Surveys. Earn Cash. Influence the Future of IT
>>>> Join SourceForge.net's Techsay panel and you'll get the chance to
>>>> share your
>>>> opinions on IT & business topics through brief surveys - and earn cash
>>>> http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
>>>>
>>>> _______________________________________________
>>>> Nagios-users mailing list
>>>> Nagios-users at lists.sourceforge.net
>>>> https://lists.sourceforge.net/lists/listinfo/nagios-users
>>>> ::: Please include Nagios version, plugin version (-v) and OS when
>>>> reporting any issue. ::: Messages without supporting info will risk
>>>> being sent to /dev/null
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>>
>
>