strange behavior with multiple failing hosts and nagios 1.3 / 2.1
Christian Lyra
lyra at pop-pr.rnp.br
Sat Apr 8 02:10:36 CEST 2006
Hi there,
I was evaluating Nagios and found some strange behavior on my test setup. After
a fresh install, I did a minimal setup: just one contactgroup with one
contact, and a hostgroup with 4 hosts (no parent relationships). Since I'm only
interested in knowing whether a host is up or down, I just configured a check_ping
service for each host. As I said, a pretty simple setup. The services are
scheduled to run every minute, with only one check attempt (no retries).
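For reference, the relevant parts of my object configuration look roughly
like this (host names, the generic-service template, and the ping thresholds
are illustrative, not copied verbatim):

    define hostgroup{
            hostgroup_name          test-hosts
            alias                   Test hosts
            members                 host1,host2,host3,host4
            }

    define service{
            use                     generic-service
            hostgroup_name          test-hosts
            service_description     PING
            check_command           check_ping!100.0,20%!500.0,60%
            normal_check_interval   1
            max_check_attempts      1
            }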
To simulate a network problem, I just did an "iptables -A INPUT -p icmp -j
DROP". I was expecting to see all hosts/services go down within a
minute, as Nagios normally "spreads" the checks across the one-minute interval
(default configuration). To my surprise, I saw just one host go down in the first
minute, with the remaining hosts going down one per minute after that. I mean, host 1
goes down at, say, 8:40:13, host 2 at 8:41:05, host 3 at 8:42:05 and host 4
at 8:43:05. I saw the last host go down almost 4 minutes after the
"network problem" started.
My first try was with Nagios 1.3, but I could reproduce the same problem
with Nagios 2.1. When I asked a friend to run the same test, he got the same
results, actually a little worse: since he does not check his hosts/services every
minute, he saw one host go down every 3 minutes, and even after 10 minutes not
all of his hosts were shown as down.
To my surprise, all the hosts came back up at about the same time after I removed
the iptables rule. I could not find an explanation for this behavior, and I couldn't
find anything wrong with the configuration. I'm not sure if this is a
feature, or if I hit a bug. A serious bug, to tell the truth.
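(For completeness: I removed the rule with the matching delete command,
"iptables -D INPUT -p icmp -j DROP".)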
I only did a minimal search of the mailing list archives and forums, so excuse me
if this is a known issue, and please point me to where I can find more about it.
Christian Lyra