Setting up Escalations.

Giorgio Zarrelli zarrelli at linux.it
Tue Apr 6 22:31:05 CEST 2010


Hi,

the workflow for check/notification/escalation is the following:


1. While the service/host is in an OK state, it is checked with the
check_interval timing;

2. When the service/host goes into a NON OK state but has not yet
reached max_check_attempts, it enters a SOFT NON OK state and the next
check is scheduled with the retry_interval timing;

3. When the service/host in a NON OK state reaches max_check_attempts,
it enters a HARD NON OK state and the next service/host check is
scheduled with the check_interval timing;

4. Now, if you set first_notification_delay, the first notification is
delayed by that many time units (0 means the notification is sent
immediately);

5. If you did not set first_notification_delay, the first notification
is sent immediately and the following ones are scheduled with the
notification_interval timing (0 means only the first notification is
sent and no further ones follow);

6. In your first escalation, which starts at the third notification,
the notification interval changes to 45 minutes. So the first
notification is sent once max_check_attempts is reached (assuming you
did not set any delay), the second after 10 time units (usually 10
minutes), the third 10 mins after the second, the fourth 45 mins after
the third, the fifth 45 mins after the fourth, and the sixth 45 mins
after the fifth.


7. From the sixth notification, the second escalation comes into play.
The seventh notification is sent 60 minutes after the sixth, and all
later notifications follow 60 mins apart. Keep in mind that, because
you used 0 as the last_notification value, your escalation will never
end until your check returns an OK status.
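
The timings in points 6 and 7 can be checked with a small Python
sketch. It is only an idealized model of the description above, not
Nagios code: the function names are made up for illustration, the base
notification_interval of 10 is taken from the question quoted below,
and times are counted in minutes from the first notification. It also
assumes escalation ranges do not overlap and ignores the special case
of a notification_interval of 0.

```python
from typing import List, Tuple

# (first_notification, last_notification, notification_interval);
# last_notification == 0 means the escalation never ends.
Escalation = Tuple[int, int, int]


def interval_after(n: int, base_interval: int,
                   escalations: List[Escalation]) -> int:
    """Minutes to wait after notification number n (counted from 1).

    Hypothetical helper: the first escalation whose range contains n
    supplies the interval, otherwise the base interval applies.
    """
    for first, last, interval in escalations:
        if n >= first and (last == 0 or n <= last):
            return interval
    return base_interval


def notification_times(count: int, base_interval: int,
                       escalations: List[Escalation],
                       first_delay: int = 0) -> List[int]:
    """Offsets in minutes of the first `count` notifications."""
    times = []
    t = first_delay
    for n in range(1, count + 1):
        times.append(t)
        t += interval_after(n, base_interval, escalations)
    return times


# The two serviceescalation definitions from the original question.
escalations = [(3, 5, 45), (6, 0, 60)]
schedule = notification_times(8, base_interval=10, escalations=escalations)
for n, t in enumerate(schedule, start=1):
    print(f"notification {n}: +{t} min")
```

In a real installation the notification logic runs when check results
are processed, so the actual times can drift from this model depending
on check_interval.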


I do not know whether I have answered your questions; I hope I have
explained the notification/escalation timings correctly and clearly.

On the Nagios wiki you will find a flowchart I wrote to clarify the
logic dealing with the notification framework.

Ciao,

Giorgio



On Mon, 05/04/2010 at 12.48 -0400, dOE wrote:
> Thanks Giorgio,
>
> Then, with notification_interval set to "10" for the production
> environment, would the escalation be processed after 10 minutes of the
> alert not being "ok"? And should subsequent notification_interval
> values be set to more than "10" so that those contacts are then
> notified too?
>
> Am I understanding this correctly?
>
> On Mon, Apr 5, 2010 at 6:34 AM, Giorgio Zarrelli <zarrelli at linux.it>
> wrote:
>         Hi,
>
>         First, local definitions win over those written in templates,
>         so if in the template you have a notification_interval value
>         and in the escalation you have another, escalation wins and
>         its value is adopted.
>
>         Second, notification_interval is the interval between two
>         consecutive notifications for a host or a service, after it
>         enters a non ok status and has exceeded max_check_attempts
>         value.
>
>         Ciao,
>
>         Giorgio
>
>         On 05/Apr/2010, at 04.50, dOE <doepain at gmail.com>
>         wrote:
>
>
>                 I am having difficulty getting escalations to work
>                 on Nagios 3.0.3
>
>                 The following is pulled from the documentation:
>
>                 define serviceescalation{
>                        host_name               webserver
>                        service_description     HTTP
>                        first_notification      3
>                        last_notification       5
>                        notification_interval   45
>                        contact_groups          ITOps_Oncall,managers
>                        }
>
>                 define serviceescalation{
>                        host_name               webserver
>                        service_description     HTTP
>                        first_notification      6
>                        last_notification       0
>                        notification_interval   60
>                        contact_groups          ITOps_Oncall,managers,everyone
>                        }
>
>                 I have read the documentation, but I don't understand
>                 what the "notification_interval" values are based on,
>                 and since we have hosts inheriting from a "core"
>                 template it is very difficult to test escalations.
>                 We use OpCfg to do our Nagios configuration, but it
>                 does not stop me from occasionally going into the
>                 actual configuration files to make changes either.
>
>                 If anyone has this working, could you shed some light
>                 on how I can get this to work, or clarify the
>                 documentation's explanation of it?  Also, since I am
>                 inheriting from a template, I feel as though the
>                 changes I make to a particular host (to test) are being
>                 ignored, or it may be me not understanding what the
>                 "notification_interval" values are exactly.
>
>                 Any advice is very much appreciated.
>
>                 _______________________________________________
>                 Nagios-users mailing list
>                 Nagios-users at lists.sourceforge.net
>                 https://lists.sourceforge.net/lists/listinfo/nagios-users
>                 ::: Please include Nagios version, plugin version (-v)
>                 and OS when reporting any issue.
>                 ::: Messages without supporting info will risk being
>                 sent to /dev/null
>
>



