[PATCH] notifications: Fix first_notification_delay

Jochen Bern Jochen.Bern at LINworks.de
Mon Dec 10 18:27:32 CET 2012


On 10.12.2012 16:44, Andreas Ericsson wrote:
> AFAIR, the original use-case [of first_notification_delay] was
> to allow operators to react to HARD alerts and acknowledge or
> fix them before notifications were sent out.

At least, that's how some organizations *did and do* use it. Some
probably also count on the event handler that runs when the state turns
HARD as one more thing that might make the notification unnecessary at
the last second.

FWIW, from a rather principles-oriented point of view, the sequence of
SOFT non-OK --> HARD non-OK --> first notification (with the different
degrees of visibility these states imply) is as much a part of the
system of escalations as the part *called* "escalations" is. I wonder
whether a long-term consolidation of terms and mechanisms might prove
beneficial.

> * If delaying the notification causes it to end up in a time where
>   notifications should be sent, it should be sent even if the time of
>   the alert happened during a period when no notifications should have
>   been sent.
> * If delaying the check causes it to switch to a state which should
>   not result in a notification, no notification should be sent out.

(That's how escalations *already* behave WRT earlier non- or
lesser-escalated notifications, isn't it? Hence, The Right Thing To Do
(tm) in my books.)
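
The two rules quoted above can be sketched as a single check that runs when
the delay expires, rather than when the alert first turns HARD. This is only
an illustration under stated assumptions, not Nagios code: the `Alert`
class, `should_send_delayed()`, the `in_period` callback, and the delay
expressed in plain seconds are all hypothetical names and simplifications.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class Alert:
    """Hypothetical, simplified view of a host or service problem."""
    state: str                  # e.g. "OK", "WARNING", "CRITICAL"
    hard_since: float           # time (seconds) at which the state turned HARD
    acknowledged: bool = False  # an operator acknowledged the problem


def should_send_delayed(
    alert: Alert,
    now: float,
    delay_seconds: float,
    in_period: Callable[[float], bool],
    notify_states: Set[str],
) -> bool:
    """Decide at time `now` whether the delayed first notification goes out."""
    # Still inside the delay window: hold the notification back.
    if now < alert.hard_since + delay_seconds:
        return False
    # The problem was acknowledged or moved to a state that should not
    # notify while we were waiting: send nothing.
    if alert.acknowledged or alert.state not in notify_states:
        return False
    # The notification period is judged at sending time, not at the time
    # the alert originally happened.
    return in_period(now)
```

With a period that starts at t=1000, an alert that turned HARD at t=900 and
a delay of 200 seconds, the notification goes out at t=1100 even though the
alert itself happened outside the period.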

> * Delaying a notification should not increase its notification_number,
>   and will, as such, affect both regular and escalated notifications.

*Most definitely* agreed! I know several organizations that would be
confused to no end if I had to tell them that, under certain
circumstances, there *just was no* notification #n preceding the #n+1
they received and are trying to make sense of.
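
To make the numbering point concrete, here is a minimal sketch in which the
counter advances only when a notification is actually sent, so a held-back
attempt never consumes a number. `NotificationCounter` and
`escalation_applies()` are hypothetical names for this illustration;
treating `last == 0` as "no upper bound" reflects my understanding of
escalation ranges and should be read as an assumption.

```python
from typing import Optional


class NotificationCounter:
    """Hypothetical counter that numbers only notifications actually sent."""

    def __init__(self) -> None:
        self.current_number = 0

    def attempt(self, delayed: bool) -> Optional[int]:
        """Return the number of the notification sent, or None if held back."""
        if delayed:
            # A delayed attempt must not consume a notification number.
            return None
        self.current_number += 1
        return self.current_number


def escalation_applies(number: int, first: int, last: int) -> bool:
    """Check whether notification `number` falls into an escalation range.

    Assumption: last == 0 means the range has no upper bound.
    """
    return number >= first and (last == 0 or number <= last)
```

Two delayed attempts followed by two real ones yield the numbers 1 and 2,
so recipients never see a #n+1 without a preceding #n, and an escalation
that starts at notification 2 still triggers on the second message that
was really sent.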

> * Custom-, downtime-, acknowledgement and flapping notifications will
>   never be delayed (flapping is arguable, but matches current code).

I am not aware, off the top of my head, of how Acknowledgment and
Flapping notifications are supposed to behave WRT earlier notifications
(as in "RECOVERYs are only sent to contacts who also had the PROBLEM
sent to them"). If such a dependency does/should/will exist, whether or
not to exempt them from first_notification_delay translates into
potentially different sets of recipients.
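
A small sketch of why the exemption matters for the recipient set, assuming
such a dependency exists at all (which, as said, I am not sure of).
`recipients_for()` and its parameters are hypothetical and only model the
"sent only to contacts who already got the PROBLEM" rule.

```python
from typing import Iterable, List, Set


def recipients_for(
    ntype: str,
    configured: Iterable[str],
    already_notified: Set[str],
    depends_on_problem: Set[str],
) -> List[str]:
    """Return the contacts that would receive a notification of type `ntype`."""
    if ntype in depends_on_problem:
        # Only contacts who previously received the PROBLEM are eligible.
        return [c for c in configured if c in already_notified]
    # No dependency: every configured contact is eligible.
    return list(configured)
```

If an ACKNOWLEDGEMENT goes out while the PROBLEM is still being held back,
`already_notified` is empty and, under this rule, nobody receives it; once
the PROBLEM has gone out, the set grows accordingly.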

For acknowledgments, sending the notification early (and to the
*restricted* set of recipients) is likely what the person acknowledging
the problem *wants* to happen. FWIW, same thing for Downtimes, which are
technically prophetic acknowledgments. ;-)

Customs can probably lean both ways, depending on what you use them for.

Flapping ... I'll have to pass on that. The things I monitor do not
really flap, so I typically have flapping detection disabled.
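
For reference, the exemption proposed in the quoted bullet boils down to a
type check like the one below. The enum and function names are hypothetical
and not taken from the Nagios source; RECOVERY is left out on purpose
because this thread does not settle how it should behave.

```python
from enum import Enum


class NotificationType(Enum):
    PROBLEM = "problem"
    ACKNOWLEDGEMENT = "acknowledgement"
    FLAPPING = "flapping"
    DOWNTIME = "downtime"
    CUSTOM = "custom"


# Types that, per the proposal, are never held back by the delay.
EXEMPT_FROM_DELAY = {
    NotificationType.ACKNOWLEDGEMENT,
    NotificationType.FLAPPING,   # arguable, but matches current code
    NotificationType.DOWNTIME,
    NotificationType.CUSTOM,
}


def is_subject_to_delay(ntype: NotificationType) -> bool:
    """Return True if first_notification_delay applies to this type."""
    return ntype not in EXEMPT_FROM_DELAY
```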

Regards,
								J. Bern
-- 
*NEU* - NEC IT-Infrastruktur-Produkte im <http://www.linworks-shop.de/>:
Server--Storage--Virtualisierung--Management SW--Passion for Performance
Jochen Bern, Systemingenieur --- LINworks GmbH <http://www.LINworks.de/>
Postfach 100121, 64201 Darmstadt | Robert-Koch-Str. 9, 64331 Weiterstadt
PGP (1024D/4096g) FP = D18B 41B1 16C0 11BA 7F8C DCF7 E1D5 FAF4 444E 1C27
Tel. +49 6151 9067-231, Zentr. -0, Fax -299 - Amtsg. Darmstadt HRB 85202
Unternehmenssitz Weiterstadt, Geschäftsführer Metin Dogan, Oliver Michel
