BUG: Passive host check results are always in HARD state.
jan.david at agfa.com
Tue Jul 4 17:44:58 CEST 2006
Hi,
We have a distributed Nagios set-up with three (slave) check engines
performing active checks and sending their results to a master server
which collects all results and sends out alarms if need be.
Our department has had a lot of complaints about false positives for
remote hosts connected over a WAN link.
Because WAN links are more prone to packet loss than LAN links, we've set
the number of host retries to 10, figuring that this would avoid any false
alerts about hosts being down while in fact it is just a temporary glitch
in the line.
This setup did not work, however. Further investigation into the cause
revealed what I believe to be a bug.
When host check results are received in PASSIVE mode, the number of
retries is not taken into account and a negative response immediately
results in a HARD state, which in turn sends out alerts.
This is a very annoying bug because it can create a lot of unnecessary
notifications if you're monitoring a machine over a WAN link.
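To make the expected behaviour concrete, here is a minimal sketch of
retry-aware state handling. It is not taken from the Nagios source; the
names HostState and apply_result are hypothetical, and recovery handling
is simplified to a plain reset. It only illustrates that a DOWN result
should stay SOFT until the configured number of attempts is reached.

```python
from dataclasses import dataclass

UP, DOWN = 0, 1  # illustrative status codes for this sketch


@dataclass
class HostState:
    """Hypothetical container for the fields discussed in this report."""
    max_attempts: int
    status: int = UP
    attempt: int = 1
    state_type: str = "HARD"


def apply_result(host: HostState, status: int) -> HostState:
    """Apply one check result (active or passive) with retry-aware logic."""
    if status == UP:
        # Simplification: a recovery resets the attempt counter.
        host.status, host.attempt, host.state_type = UP, 1, "HARD"
        return host
    if host.status == UP:
        # First failure after the host was UP: start counting at 1.
        host.attempt = 1
    elif host.state_type == "SOFT":
        # Still retrying: count this failure as one more attempt.
        host.attempt += 1
    host.status = status
    # Only go HARD once the configured number of attempts is used up.
    host.state_type = "HARD" if host.attempt >= host.max_attempts else "SOFT"
    return host
```

With max_attempts set to 10, the first nine DOWN results would leave the
host in a SOFT state, and only the tenth would make it HARD.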
I first experienced this bug while running Nagios 2.2 and have recently
upgraded to 2.4, to no avail.
In our normal setup, a slave machine performs an active host check
and sends the result through nsca to the master server. That setup is
not needed to reproduce the buggy behaviour, though. You can easily do
it as follows:
1) Pick a machine in your nagios configuration that you can play with.
As you can see from the first screenshot, the machine is currently in
attempt 1/10, state type HARD and last result was passive:
2) Click on "Submit passive check result for this host"
3) Commit and wait a minute:
As can be seen, the passive check immediately results in a HARD state,
even though the attempt is only 1/10.
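The same test can probably be run without the web interface by writing
a PROCESS_HOST_CHECK_RESULT external command to the Nagios command file.
The sketch below only formats such a line; the helper names are
hypothetical, and the location of the command file depends on your
installation, so it is passed in as a parameter rather than assumed.

```python
import time
from typing import Optional


def host_check_command(host_name: str, status_code: int,
                       plugin_output: str,
                       timestamp: Optional[int] = None) -> str:
    """Format a passive host check result as an external command line.

    Host status codes: 0 = UP, 1 = DOWN, 2 = UNREACHABLE.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return (f"[{timestamp}] PROCESS_HOST_CHECK_RESULT;"
            f"{host_name};{status_code};{plugin_output}\n")


def submit(command_file: str, line: str) -> None:
    """Write one command line to the command file (path is site-specific)."""
    with open(command_file, "w") as handle:
        handle.write(line)
```

Submitting a single DOWN result (status code 1) for a host configured
with 10 retries should be enough to see whether it goes straight to a
HARD state.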
Note that PASSIVE service checks work as expected; it's only host checks
that exhibit this behaviour.
Would it be possible to post a patch for this bug, or could a fix be
incorporated in a future release?
Best Regards,
Jan David
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/gif
Size: 47452 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20060704/770865f4/attachment.gif>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/gif
Size: 35289 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20060704/770865f4/attachment-0001.gif>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/gif
Size: 42080 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20060704/770865f4/attachment-0002.gif>