Notification interval & escalation overlapping
Ethan Galstad
nagios at nagios.org
Sun May 21 21:34:54 CEST 2006
Dirk De Coninck wrote:
> Hi all,
>
> First I would like to thank the Nagios developers for providing us this
> wonderful tool and sharing it with the community.
>
> I am using Nagios 2.2 and there seems to be a bug when using escalation
> overlapping with different notification intervals.
> What I want to achieve is this:
> For all service and hosts notifications an email is to be sent to the
> system administrators list with a notification interval of 20 minutes.
> When a critical or down notification (not for warnings) is not
> acknowledged within 20 minutes an email should be sent to the managers
> list.
> The managers only want to get 1 notification to report the critical or
> down state and a recovery notification whenever the status is recovered
> but only for recoveries that they initially got a notification for.
>
> To achieve this I created the following escalation definition templates:
> define serviceescalation{
> name mgmt
> first_notification 2
> last_notification 3 ; if I put 2 here, they never get the recovery notification
> contact_groups mgmt
> notification_interval 0 ; if I put 20 here, they get 2 critical notifications and no recovery
> escalation_period 24x7
> escalation_options u,c,r
> register 0
> }
>
> define serviceescalation{
> name admins
> first_notification 3
> last_notification 0
> contact_groups admins
> notification_interval 20
> escalation_period 24x7
> escalation_options w,u,c,r
> register 0
> }
>
> define hostescalation{
> name host-mgmt
> first_notification 2
> last_notification 3
> contact_groups mgmt
> notification_interval 0
> escalation_period 24x7
> escalation_options d,u,r
> register 0
> }
>
> define hostescalation{
> name host-admins
> first_notification 3
> last_notification 0
> contact_groups admins
> notification_interval 20
> escalation_period 24x7
> escalation_options d,u,r
> register 0
> }
>
> All hosts and services have the admins as contact group with a 20
> minutes notification interval.
> First I tried only adding an escalation entry for the mgmt group (worked
> fine in version 1.3):
> define serviceescalation{
> use mgmt
> hostgroup_name internet-servers
> service_description PING
> }
>
> define hostescalation{
> use host-mgmt
> hostgroup_name internet-servers
> }
>
> The result is that I get the first notification to the admins and 20
> minutes later I get the escalation notification to the mgmt and then
> nothing anymore.
>
> Then I tried adding an escalation for the admins:
> define serviceescalation{
> use admins
> hostgroup_name internet-servers
> service_description PING
> }
>
> define hostescalation{
> use admins
> hostgroup_name internet-servers
> }
>
> But no result either.
>
> The only way I can almost make it work is by changing the management
> template notification interval to 20 and setting the last notification
> to 2. The admins get the reminders this way, but the mgmt never gets the
> recovery notification.
>
> Just a thought, but what would make everything a lot easier is the
> possibility of defining first_notification, last_notification,
> notification_interval, escalation_period and escalation_options in the
> contacts definition (contacts.cfg).
> Since it is advised to work with templates to make things more easy, why
> don't the service definitions inherit most of the parameters like
> contact groups, escalations, notification interval... from the host
> definitions unless specified in the service definition?
> Do I submit this as a feature requests?
>
> Sorry for this long email, I wanted to supply all relevant information.
> Thanks for any help you can give me to make this work.
>
> Kind regards,
> Dirk.
>
Nagios will only send recovery notifications to the contact(s) that last
received a problem notification. This means that if the problem
persists and is escalated past the management team, the admins will
receive recovery alerts, but the managers will not.
One possible solution would be to include the managers in the contact
groups that get notified for subsequent (>2) alerts. Do this by
creating duplicate contact definitions for the managers and specifying
only "r" for the notification options. This will mean that the managers
only receive recovery alerts, no matter how many problem alerts get sent
out to the admins.
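A rough, untested sketch of what that could look like is below. The
contact name, alias, email address, the mgmt-recovery group name and the
notification command names are placeholders, so substitute whatever your
own configuration already defines. The admins template is the one from
your mail with the extra contact group added.

```
; Hypothetical duplicate of an existing manager contact that only
; accepts recovery notifications.
define contact{
        contact_name                    manager1-recovery
        alias                           Manager 1 (recoveries only)
        host_notification_period        24x7
        service_notification_period     24x7
        host_notification_options       r
        service_notification_options    r
        host_notification_commands      host-notify-by-email    ; assumed command name
        service_notification_commands   notify-by-email         ; assumed command name
        email                           manager1@example.com    ; placeholder address
        }

; Hypothetical group holding the recovery-only duplicates.
define contactgroup{
        contactgroup_name       mgmt-recovery
        alias                   Managers (recovery only)
        members                 manager1-recovery
        }

; Escalation for notification 3 and later: the admins keep getting
; reminders, and the recovery-only manager contacts are notified too.
define serviceescalation{
        name                    admins
        first_notification      3
        last_notification       0
        contact_groups          admins,mgmt-recovery
        notification_interval   20
        escalation_period       24x7
        escalation_options      w,u,c,r
        register                0
        }
```

The same contact_groups change would apply to the host-admins
hostescalation template.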
Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org