Notification escalate too fast if no one notified

Shad L. Lords lists at lordsfam.net
Thu Aug 12 01:00:01 CEST 2004


>> What negative side effects did you observe?  I've had no problems on this 
>> side and notifications are going out as expected.  The way I see it if 
>> there isn't a person defined for a given escalation then it would fail 
>> back to the default contact.  So there will always be someone defined for 
>> a given notification number but they might not be notified because of out 
>> of hours. I still can't understand why you wouldn't want to increment 
>> notification numbers even if you don't notify someone.
>>
>
> The notification number is very useful in telling the boss (who only gets 
> notification number 5 or something) that his staff hasn't been doing a 
> very good job at fixing an issue they knew existed. If the notification 
> number is incremented without sending notifications, the boss might get 
> mad for no good reason. Besides, the natural value for notification_number 
> is number of notifications actually sent. Maybe adding a 
> suppressed_notifications variable and macro would help?

Again I ask: what negative side effect is seen?  The number represents the
number of notifications that would have been sent, i.e. attempts to send.
Just because a notification didn't get sent doesn't mean that there was no
attempt to send it.


>> We have our notification interval set at 30 minutes.  The pager gets 
>> notified at 1-3, tier 2 kicks in at 3-0, and managers get 5-5.  There are 
>> situations where tier 2 is set to work hours only.  However management 
>> wants to always be notified 24x7 at number 5.  With the patch I submitted 
>> things are working exactly as we want.
>
> But it would most likely break things elsewhere, so someone else will most 
> likely patch it back to work the way it did before.

What would break?  I think it is counterintuitive to say that I want
someone notified at the 5th 30-minute interval (2 hours) and have them never
get notified.  When setting things up you don't want to have to play around
with funky settings just so notification 5 goes out.  You might have someone
go on vacation, turn off his/her hours for a week, and all of a sudden the
upper levels don't get notified.  Just because someone doesn't exist or
doesn't get notified at a given level shouldn't negatively affect the upper
levels.  The way things are now, it just escalates to the upper levels a lot
quicker (at the service check interval instead of the notification
interval).  So switching escalation off when nobody gets notified will break
things more than my patch does.  People will have more of an issue if
notifications that they were getting all of a sudden stop coming.
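
To make the difference concrete, here is a rough Python sketch of the two
counting rules.  It is a toy model, not Nagios code: the Escalation class,
the always_on flag (standing in for a timeperiod) and the simulate()
function are all made up for illustration, and the ranges are the ones from
our setup quoted above.

```python
from dataclasses import dataclass


@dataclass
class Escalation:
    name: str
    first: int        # first notification number covered
    last: int         # last notification number covered, 0 = no upper bound
    always_on: bool   # False = work hours only (simplified timeperiod)

    def matches(self, number: int) -> bool:
        return number >= self.first and (self.last == 0 or number <= self.last)


# Hypothetical model of the setup described in this thread.
ESCALATIONS = [
    Escalation("pager", 1, 3, True),
    Escalation("tier2", 3, 0, False),
    Escalation("managers", 5, 5, True),
]


def simulate(rounds, work_hours, count_suppressed):
    """Return (attempted_number, recipients) for each notification interval.

    count_suppressed=True  : the number advances even if nobody is notified.
    count_suppressed=False : the number advances only when someone is notified.
    """
    log = []
    number = 0
    for _ in range(rounds):
        candidate = number + 1
        recipients = [e.name for e in ESCALATIONS
                      if e.matches(candidate) and (e.always_on or work_hours)]
        if recipients or count_suppressed:
            number = candidate
        log.append((candidate, recipients))
    return log


# Outside work hours, six notification intervals.
counting_attempts = simulate(6, work_hours=False, count_suppressed=True)
counting_sent_only = simulate(6, work_hours=False, count_suppressed=False)

for label, log in (("count attempts", counting_attempts),
                   ("count sent only", counting_sent_only)):
    print(label, log)
```

In this model, counting attempts lets notification 5 reach the managers
even though number 4 went to nobody, while counting only sent notifications
leaves the number stuck at 4 until tier 2 comes back on duty.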

>> If notification numbers aren't incremented unless someone gets notified, 
>> then unless 4 went out, managers would never get notified.
>>
>
> This is a configuration issue that can be worked around by using retry 
> intervals and max check attempts. If first notification goes nowhere, you 
> most likely have a very non-standard configuration, so any patches that 
> will make it easier for you to keep it that way are likely to break things 
> for hordes of other Nagios users.

I know the idea has been kicked around of how to make it so that you only
get notified if a system has been down 15 minutes.  You can do this by
playing with check and retry intervals, but you can also set the
notification interval to 15 minutes and start sending pages at notification
number 2 to accomplish the same thing (with my patch).
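
The arithmetic behind that, as a small Python sketch.  It assumes that
notification 1 is attempted the moment the problem is confirmed and that
every later number follows one notification interval apart, which is only
true if suppressed attempts are counted; the function name is made up.

```python
def minutes_until_notification(number, interval_minutes):
    """Minutes from the first notification attempt until attempt `number`.

    Assumes attempt 1 happens at minute 0 and the number advances once per
    notification interval, whether or not anyone was actually notified.
    """
    if number < 1:
        raise ValueError("notification numbers start at 1")
    return (number - 1) * interval_minutes


# Paging from notification 2 with a 15-minute interval.
first_page_delay = minutes_until_notification(2, 15)

# Managers at notification 5 with a 30-minute interval.
manager_delay = minutes_until_notification(5, 30)

print(first_page_delay, manager_delay)
```

Under those assumptions the first page goes out after 15 minutes, and the
managers' notification 5 lands at the 2-hour mark mentioned earlier.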

-Shad


