[PATCH] notifications: Fix first_notification_delay
Jim Avery
jim at jimavery.me.uk
Wed Dec 12 20:55:53 CET 2012
On 12 Dec 2012 19:10, "Jochen Bern" <Jochen.Bern at linworks.de> wrote:
>
> On 11.12.2012 22:56, Jim Avery wrote:
> > We want to send an SMS notification if the UPS goes on to battery, but only
> > if it has been on battery for more than, say, five minutes. I had hoped
> > that first_notification_delay would give me that possibility. Obviously as
> > this is a passive check [...]
> >
> > Please forgive me that I don't understand the programmatical issues well
> > enough to see if any of the proposed solutions so far will fit this use
> > case.
>
> I'd say that our discussion is perpendicular to the issue you raise -
> or, in other words, that first_notification_delay is unlikely to
> suddenly work the way you want afterwards.
>
> You see, while everyone just calls it "passive checks", the terminology
> in the web UI - "active checks disabled" - is more correct. (Yes, a
> service *could* have active *and* passive check results going into it.
> I've just never seen a working setup doing that, or a use case seriously
> asking for it.) Using "passive checks" means that Nagios can forget
> about scheduling commands for that service, ever; waiting for events and
> reacting to them is all that's required.
>
> Asking that Nagios now *should* schedule a notification being sent
> first_notification_delay after receiving a non-OK passive result sort of
> negates the entire concept.
>
> What I'd try in your place is to write the (last) trap into a cache
> file, and then run an *active* check looking for the reported state *and
> timestamp*, possibly differentiating WARNING/CRITICAL based on the latter.
> J. Bern
Thanks Jochen,

Yes, it's probably unreasonable of me to expect the notification logic to
become so far divorced from the active check scheduler given that we are
where we are. And to be honest, in my environment there is (currently) only
this one particular case where it's an issue, so crafting a plugin to do
pretty much what you suggest shouldn't be a Big Deal.
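
As a rough illustration of the cache-file approach Jochen describes, here is
a minimal Python sketch, not a tested plugin. The trap handler would call
record_trap(), and Nagios would run the evaluation as an active check. The
cache path, the JSON layout, the state name "on_battery" and the 300-second
threshold are all assumptions made up for this example; only the exit codes
(0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the usual plugin convention.

```python
#!/usr/bin/env python3
"""Sketch: active check that reads the last UPS trap from a cache file."""
import json
import sys
import time

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Hypothetical values: path, state name and threshold are assumptions.
CACHE_FILE = "/var/cache/nagios/ups_trap.json"
CRITICAL_AFTER = 300  # seconds on battery before the check goes CRITICAL


def record_trap(path, state, now=None):
    """Called by the trap handler: store the reported state and its timestamp."""
    stamp = time.time() if now is None else now
    with open(path, "w") as fh:
        json.dump({"state": state, "timestamp": stamp}, fh)


def evaluate(path, critical_after=CRITICAL_AFTER, now=None):
    """Return (exit_code, message) from the cached state and its age."""
    now = time.time() if now is None else now
    try:
        with open(path) as fh:
            trap = json.load(fh)
        state, since = trap["state"], float(trap["timestamp"])
    except (OSError, ValueError, KeyError, TypeError) as exc:
        # Missing or malformed cache file: report UNKNOWN rather than guess.
        return UNKNOWN, "UNKNOWN: cannot read trap cache: %s" % exc
    if state != "on_battery":
        return OK, "OK: UPS not on battery"
    age = int(now - since)
    if age >= critical_after:
        return CRITICAL, "CRITICAL: UPS on battery for %d s" % age
    return WARNING, "WARNING: UPS on battery for %d s" % age


def main(argv=None):
    """Print the status line and return the plugin exit code."""
    argv = sys.argv[1:] if argv is None else argv
    code, message = evaluate(argv[0] if argv else CACHE_FILE)
    print(message)
    return code
```

To use this as a check command, the script would end with
sys.exit(main()) under the usual __main__ guard. With SMS notifications
limited to CRITICAL, a short outage would only ever reach WARNING.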

In fact, your comment about mixing active and passive checks got me thinking:
I could probably have Nagios actively check the relevant OID every few
minutes as well as receiving the passive traps. I would then get all of
the events properly logged, but would also be able to delay the
notification by setting max_check_attempts and the check intervals
(check_interval, retry_interval) appropriately or, even better, use a
service escalation so that short-duration events are notified by email and
longer-duration events by SMS too.

Cheers, and thanks again,
Jim