Passive services stale problem
Mooney, Ryan
ryan.mooney at pnl.gov
Fri May 30 00:22:28 CEST 2003
Hello,
I have a minor problem with setting up correct staleness checking for passive services.
The scenario is basically like this:
Host goes down (ok).
Data is not available from that host, so the passive service check times out
  (ok: the passive service "is stale" script checks $HOSTSTATE$ and, if it is
  DOWN/UNKNOWN, returns OK, so it doesn't even show on the web,
  much less send an alert).
Host comes up some (indeterminate) time later (ok).
Before the passive service "injector" runs, the service staleness checker
  times out and runs the staleness script; this triggers a "service is stale"
  alert even though it's really not stale, it just hasn't had time to update yet.
What I really need is a way to see how long ago the host came back up, so that
if it's within the passive service check run interval I can just return either
OK or a warning, but I didn't see a macro that differentiates host vs. service
last state change.
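
To make the desired check concrete, here is a minimal Python sketch of the decision the "is stale" script would make, assuming the time since the host came back up were somehow available to it. The function name `stale_status` and its parameters are hypothetical (they are not Nagios macros or API); only the exit codes follow the usual plugin convention.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def stale_status(host_state, seconds_since_host_up, check_interval):
    """Return the exit code the "is stale" script should report.

    host_state            -- value of $HOSTSTATE$ passed in by Nagios
    seconds_since_host_up -- hypothetical input: seconds since the host
                             last came back up, or None if unknown
    check_interval        -- passive service check run interval, in seconds
    """
    # Host is not up: the missing data is expected, so stay quiet.
    if host_state != "UP":
        return OK
    # Host only just came back: the injector has not had a chance to run yet.
    if seconds_since_host_up is not None and seconds_since_host_up < check_interval:
        return WARNING  # could equally be OK, depending on taste
    # Host has been up long enough (or we cannot tell): really stale.
    return CRITICAL
```

Whether the grace period returns OK or WARNING is a matter of preference; the sketch picks WARNING so the condition is still visible on the web interface.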
I'm running this with ~1000 hosts that are being rebooted a lot (and a bunch of
services) so I get a LOT more spurious alerts than you might initially imagine.
I suppose I could set up an event handler that runs on host state change and
logs the change somewhere, and have the "is stale" script check that log, but I
was hoping for a simpler and more elegant solution.
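
For reference, that workaround might look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the helper names, the one-file-per-host layout, and the `state_dir` location are hypothetical, and how the host name and state reach the script depends on how the event handler command is defined.

```python
import os
import time


def record_host_up(state_dir, host_name, host_state, now=None):
    """Run from a host event handler: remember when the host came UP.

    state_dir is a hypothetical directory holding one small file per host.
    """
    if host_state != "UP":
        return  # only recoveries are interesting here
    now = time.time() if now is None else now
    path = os.path.join(state_dir, host_name + ".up")
    with open(path, "w") as fh:
        fh.write(str(int(now)))


def seconds_since_host_up(state_dir, host_name, now=None):
    """Run from the "is stale" script.

    Returns the seconds elapsed since the last recorded recovery,
    or None if no recovery has been recorded for this host.
    """
    now = time.time() if now is None else now
    path = os.path.join(state_dir, host_name + ".up")
    try:
        with open(path) as fh:
            return now - int(fh.read().strip())
    except (OSError, ValueError):
        return None
```

The "is stale" script would then compare the result against the passive check interval and return OK or a warning while the host is still inside that window.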
If anyone has any ideas, let me know.