Dependency problem
Anastasios Zafeiropoulos
mls at freemail.gr
Wed Apr 7 21:58:56 CEST 2004
Hello world,
I'm having trouble with either a host dependency misconfiguration or, possibly, a bug in Nagios' dependency and
notification logic.
I am using version nagios-1.2-0.rhfc1.dag, a prebuilt package from the Dag Apt repository.
===================================================
My Topology:
===================================================
Nagios machine --- RT1 -- RT2 -- RT3
====================================================
The problem
====================================================
When RT1 goes down, or the RT1-RT2 link goes down, Nagios notices it at some arbitrary point, while it is running a service
check or a host-alive check against any part of the network that is unreachable. Let's assume that the first host Nagios finds
dead is RT3. Nagios gets no reply from RT3, so RT3 is put in a SOFT DOWN state.
Next, the retry process takes place. max_check_attempts is 30 for each host, because the links are not reliable at all
and we want to be somewhat lenient with the notifications.
When retry #30 is reached, Nagios concludes that RT3 IS DOWN, puts it in a HARD DOWN state, and looks for any
dependencies associated with RT3. As the configuration below shows, RT3 depends on RT2, so Nagios goes on to ping RT2.
While Nagios is trying to determine whether RT2 is alive, the RT1-RT2 link suddenly comes back up and the whole network
is reachable from Nagios again. Note that at this point the max_check_attempts for RT2 have not yet been used up. So Nagios
gets a response from RT2 and puts it in a HARD OK state.
As a result, RT3 is NOT checked again to see whether it is now up, as RT2 is. So a notification is sent reporting that RT3 is
down. This is a FALSE alarm. The whole network was down!
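To make the sequence easier to follow, here is a rough sketch of it as a toy Python model. It is not Nagios' code and does not claim to reproduce its scheduler. The function name rt3_false_alarm, the link_restored_at parameter and the "one check per tick" timing are assumptions made only for illustration. Only the value 30 comes from my configuration.

```python
# Toy model of the timeline described above. NOT Nagios' real logic:
# the one-check-per-tick timing and all names here are illustrative assumptions.

MAX_CHECK_ATTEMPTS = 30  # max_check_attempts from hosts.cfg


def rt3_false_alarm(link_restored_at):
    """Return True if a DOWN notification for RT3 goes out even though
    the RT1-RT2 link has already been restored.

    link_restored_at: tick at which the RT1-RT2 link comes back (assumed).
    """
    tick = 0

    # Phase 1: RT3 is in SOFT DOWN and is retried up to 30 times.
    for _ in range(MAX_CHECK_ATTEMPTS):
        tick += 1
        if tick >= link_restored_at:
            return False  # RT3 answered during the retries: no notification

    # RT3 is now HARD DOWN.
    # Phase 2: the host RT3 depends on (RT2) is checked.
    for _ in range(MAX_CHECK_ATTEMPTS):
        tick += 1
        if tick >= link_restored_at:
            # RT2 answers and is seen as OK, so the dependency does not
            # suppress the notification, and RT3 is not checked again.
            return True

    # RT2 is HARD DOWN as well: the dependency suppresses RT3's notification.
    return False


if __name__ == "__main__":
    for restored in (10, 35, 100):
        print(restored, rt3_false_alarm(restored))
```

In this model the false alarm happens whenever the link comes back after RT3's last retry but before RT2's retries run out, which is exactly the window I keep hitting.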
My configuration is below. Maybe something is wrong in my config files.
Thanks in advance, guys.
====================================================
My dependencies.cfg file
====================================================
define hostdependency{
        host_name                       RT2
        dependent_host_name             RT3
        notification_failure_criteria   d,u
        }
define hostdependency{
        host_name                       RT1
        dependent_host_name             RT2
        notification_failure_criteria   d,u
        }
===================================================
My hosts.cfg
===================================================
define host{
        use                     generic-host
        host_name               RT1
        alias                   Wireless 1
        address                 213.5.0.34
        check_command           check-host-alive
        max_check_attempts      30
        notification_interval   0
        notification_period     24x7
        notification_options    d,u
        }
define host{
        use                     generic-host
        host_name               RT2
        alias                   tsapi.twmn
        address                 10.107.13.1
        parents                 RT1
        check_command           check-host-alive
        max_check_attempts      30
        notification_interval   0
        notification_period     24x7
        notification_options    d,u
        }
define host{
        use                     generic-host
        host_name               RT3
        alias                   Wireless Internet
        address                 212.34.23.4
        parents                 RT2
        check_command           check-host-alive
        max_check_attempts      30
        notification_interval   0
        notification_period     24x7
        notification_options    d,u
        }