a question on hierarchical recovery notifications
Eric Young
ericryoung at yahoo.com
Mon Oct 21 13:44:12 CEST 2002
I haven't been able to figure this one out yet.
Hopefully you can help. I'm considering putting
Nagios on our network but just ran into a problem on
Friday. I 'faked' a major network failure by
essentially turning on ipchains to block all ICMP from
my Nagios host. I was testing about 150 hosts with 1
router as the parent of all of them. I set the
notification options on the 'children' to only 'd,r'
(after getting many 'unreachable' notifications on my
first attempt).
When I turned on my new ipchains rule, the network
went down as expected and I only received notification
for the router. Woohoo!
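
A minimal sketch of the setup described above, as a Nagios
object configuration fragment. The host names and addresses
are hypothetical placeholders, and the other directives a
complete host definition needs (check command, contacts,
notification period and so on) are left out:

```
# Hypothetical parent router; every tested host sits behind it.
define host {
    host_name               router1
    address                 192.0.2.1
    # d = down, u = unreachable, r = recovery
    notification_options    d,u,r
}

# One of the ~150 child hosts (hypothetical name and address).
define host {
    host_name               child-host-001
    address                 192.0.2.101
    # Marks router1 as this host's parent in the network map.
    parents                 router1
    # 'u' is omitted, so no 'unreachable' pages are sent,
    # but 'r' still allows recovery pages.
    notification_options    d,r
}
```

With 'r' set on the children, a recovery notification is
still permitted for each of them, which matches the
behaviour described next.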
Now, here's the problem. I left it like that for a
while, and when I brought it back up (i.e., simulating
the router's return to life), I got not only an up
notification for the router but also quite a few
'recovery' pages for the child nodes.
So, in the case of a major failure like this, I'd rather
not get pages for things that were never really down
(i.e., they were just unreachable). I don't know of a
way to set it up so that I don't get pages for a node
that was only unreachable and had a parent that was
down (unless, I guess, the node had been checked again
the correct number of times).
Any suggestions? Am I missing something?