nagios on call schedule w/ escalations?
Charlie Reddington
charlie.reddington at gmail.com
Thu Oct 2 23:42:41 CEST 2008
Jon, thanks. I got things figured out.
I set up two sets of contacts for the same users. The first set is the
regular contacts. They go in the 'admins' group and are only
contacted during their on-call schedule.
I then did almost exactly what you wrote and made a completely
separate set of contacts that can be contacted 24x7.
I have two groups, Admins and Escalations.
Escalations uses the second set of 24x7 contacts, and Admins uses the
contacts on the on-call schedule.
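
A minimal sketch of that two-group layout, assuming Nagios 3 object
syntax. The names user1_esc, user2_esc, user3_esc and admins_esc are
hypothetical (the thread only names the groups Admins and Escalations),
and the notification numbers are copied from the original question
rather than from the working config:

```
# On-call group: contacts limited to their own on-call time periods
define contactgroup{
        contactgroup_name       admins
        alias                   On-call admins
        members                 user1,user2,user3
        }

# Escalation group: the same people as separate 24x7 contacts
# (hypothetical contact names)
define contactgroup{
        contactgroup_name       admins_esc
        alias                   Admins (escalation, 24x7)
        members                 user1_esc,user2_esc,user3_esc
        }

# From the 2nd notification on, page every admin regardless of schedule
define serviceescalation{
        hostgroup_name          Servers
        service_description     *
        first_notification      2
        last_notification       3
        notification_interval   30
        contact_groups          admins_esc
        }

# From the 3rd notification on, bring in the manager as well
define serviceescalation{
        hostgroup_name          Servers
        service_description     *
        first_notification      3
        last_notification       8
        notification_interval   60
        contact_groups          admins_esc,managers
        }
```

While an escalation matches a notification, its contact groups are
used in place of the service's normal contacts, so the services
themselves can keep pointing at the on-call group.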
Inheritance wasn't really necessary, just the separate groups.
Oh, and I made a separate contact template that uses the proper
notification time period.
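
A separate contact template along those lines might look roughly like
this. It is only a sketch: the template name escalation-contact and the
contact user1_esc are hypothetical, and it assumes a 24x7 timeperiod is
already defined (the stock sample configuration ships one):

```
# Template for the escalation contacts (never registered on its own)
define contact{
        name                            escalation-contact
        use                             generic-contact
        host_notification_period        24x7
        service_notification_period     24x7
        register                        0
        }

# Second contact for the same person, reachable around the clock
define contact{
        contact_name                    user1_esc
        use                             escalation-contact
        alias                           user1 (escalation)
        email                           user1
        }
```

The on-call contacts keep their own per-user notification periods. Only
the escalation copies pick up the 24x7 periods from the template.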
Thanks again, works perfectly.
charlie
On Oct 2, 2008, at 3:13 AM, Jon Angliss wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Tue, 30 Sep 2008 16:22:16 -0500, Charlie Reddington
> <charlie.reddington at gmail.com> wrote:
>
>> Hi guys / gals,
>>
>> I am working on the final stages of my nagios setup, but I'm entering
>> territory I haven't been in before and could use some guidance.
>
> I'm sure you've probably taken a peek at the "On Call Rotations"
> details in the documentation:
>
> http://nagios.sourceforge.net/docs/3_0/oncallrotation.html
>
> There are plenty of examples to get a good idea.
>
>> Here's what I'm trying to achieve. We have a team of 3 admins, and we
>> rotate who is on call each week. Of course, it isn't a strict
>> every-third-week rotation, because of people having vacation time,
>> etc. So sometimes people are on call for 2 weeks in a row, or every
>> 2 weeks, etc.
>>
>> What we'd like is to have a schedule set up where the primary guy
>> gets woken up first. But if he doesn't answer his call after an hour,
>> it drops down to the rest of us admins. No matter if you're just at
>> home sleeping, or if you're on vacation, you get pinged. After that
>> it goes up to our manager.
>
>> I can figure out the setting of people's initial schedule, as I have
>> it looking something like this....
>>
>> # contacts
>>
>> define contact{
>> contact_name user1
>> use generic-contact
>> alias user1
>> email user1
>> host_notification_period user1_oncall
>> service_notification_period user1_oncall
>> }
>>
>> define contact{
>> contact_name user2
>> use generic-contact
>> alias user2
>> email user2
>> host_notification_period user2_oncall
>> service_notification_period user2_oncall
>> }
>>
>> define contact{
>> contact_name user3
>> use generic-contact
>> alias user3
>> email user3
>> host_notification_period user3_oncall
>> service_notification_period user3_oncall
>> }
>> define contact{
>> contact_name manager1
>> use generic-contact
>> email manager1
>> }
>>
>> # groups
>>
>> define contactgroup{
>> contactgroup_name admins
>> members user1,user2,user3
>> }
>> define contactgroup{
>> contactgroup_name managers
>> members manager1
>> }
>>
>> # Time periods
>>
>> define timeperiod{
>> timeperiod_name user1_oncall
>> september 29 - october 5 00:00-24:00
>> october 20 - october 26 00:00-24:00
>> november 17 - november 23 00:00-24:00
>> december 1 - december 7 00:00-24:00
>> december 15 - december 21 00:00-24:00
>> }
>>
>> define timeperiod{
>> timeperiod_name user2_oncall
>> october 6 - october 12 00:00-24:00
>> november 3 - november 9 00:00-24:00
>> november 24 - november 30 00:00-24:00
>> december 22 - december 23 00:00-24:00
>> }
>>
>> define timeperiod{
>> timeperiod_name user3_oncall
>> october 13 - october 19 00:00-24:00
>> october 27 - november 2 00:00-24:00
>> november 10 - november 16 00:00-24:00
>> december 8 - december 14 00:00-24:00
>> }
>
>> Would / do escalations trump the initial contacts?
>>
>> # First escalations
>> define serviceescalation{
>> hostgroup_name Servers
>> service_description *
>> first_notification 2
>> last_notification 3
>> notification_interval 30
>> contact_groups admins
>> }
>>
>> # Second escalations
>> define serviceescalation{
>> hostgroup_name Servers
>> service_description *
>> first_notification 3
>> last_notification 8
>> notification_interval 60
>> contact_groups admins,managers
>> }
>>
>> So I know this isn't quite right, as our admins are part of the admins
>> group, but I'm also trying to restrict when they get contacted. So I'm
>> not really sure how to proceed with this.
>
> You might want to read up on notifications and service escalations,
> too. Looking at the time periods you've got, only one of the admins
> will be reachable by notifications at any given time. This is because
> the time periods stop nagios from sending notifications to a contact
> who is outside their time period. For example, if a host goes down at
> 21:00 on Oct 15th, only user3 will be notified, even after the
> escalations kick in. Until the 3rd notification, user3 is the only
> recipient. It'll only get to another person when the 3rd notification
> goes out and it engages the "managers" contact group.
>
> Depending on how many users/admins you're looking at, you could use a
> trick with templating and inheritance. Keep your base users as you
> have above, then build escalation users and groups.
>
> define timeperiod {
> timeperiod_name AllTimes
> alias All Times
> sunday 00:00-24:00
> monday 00:00-24:00
> tuesday 00:00-24:00
> wednesday 00:00-24:00
> thursday 00:00-24:00
> friday 00:00-24:00
> saturday 00:00-24:00
> }
>
> define contact {
> name disable_times
> host_notification_period AllTimes
> service_notification_period AllTimes
> register 0
> }
>
> define contact{
> name user1 ; lets user1_esc inherit from this contact via "use"
> contact_name user1
> use generic-contact
> alias user1
> email user1
> host_notification_period user1_oncall
> service_notification_period user1_oncall
> }
>
> define contact{
> name user2 ; lets user2_esc inherit from this contact via "use"
> contact_name user2
> use generic-contact
> alias user2
> email user2
> host_notification_period user2_oncall
> service_notification_period user2_oncall
> }
>
> define contact {
> use disable_times,user1
> contact_name user1_esc
> }
>
> define contact {
> use disable_times,user2
> contact_name user2_esc
> }
>
> define contactgroup {
> contactgroup_name admins
> members user1,user2
> }
>
> define contactgroup {
> contactgroup_name admins_esc
> members user1_esc,user2_esc
> }
>
> Then your service escalations use admins_esc instead of just admins.
> I've not tested it, but looking at the way inheritance works, you
> should be OK.
>
> - --
> Jon Angliss
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (MingW32) - GPGshell v3.64
>
> iEYEARECAAYFAkjkgqMACgkQK4PoFPj9H3MthQCg4XgD5eNyl190umm7Ew8OouKK
> kCoAoNsRdPjpTMX/tO/eC00ejVb3MjzF
> =XHky
> -----END PGP SIGNATURE-----
>
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's
> challenge
> Build the coolest Linux based applications with Moblin SDK & win
> great prizes
> Grand prize is a trip for two to an Open Source event anywhere in
> the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null