[Nagios-devel] FW: Timeperiods and oncall rotation with UK Public holidays
Daniel Rich
drich at employees.org
Thu Apr 22 18:45:24 CEST 2010
This is one of my pet peeves with Nagios: notifications are not particularly flexible. That is why we have two sets of contacts for everyone, one that notifies via pager and e-mail and one that notifies only via e-mail, since there isn't a way to have Nagios do that for us.
We do something very similar to what you are trying to do. We have a contact "groupname-oncall" that gets assigned to hosts and services a particular group is responsible for. That alias exists both in our mail system and our paging system, and is updated automatically by a script to point to the correct individual. The hardest part of this is writing the script to do the updates, as it has to be able to parse a configuration file with the dates/times your individuals are on-call and update the mail and paging systems with the correct information. In our case, the aliases are stored in LDAP, so it is trivial for us to make the updates.
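As a rough illustration of the lookup half of such a script, here is a minimal Python sketch. The schedule format (one "start-timestamp person" pair per line) and the function names are assumptions made up for this example, not what we actually run, and the step that pushes the result into the mail and paging systems is left out entirely.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical schedule format: one "start-timestamp person" pair per line.
SCHEDULE = """\
# start of shift   who takes over
2010-04-05T17:30 person2
2010-04-12T17:30 person3
2010-04-19T17:30 person4
"""


def parse_schedule(text):
    """Return a sorted list of (start, person) tuples from the schedule text."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        stamp, person = line.split()
        entries.append((datetime.strptime(stamp, "%Y-%m-%dT%H:%M"), person))
    return sorted(entries)


def on_call_at(entries, when):
    """Return the person whose shift started most recently at or before `when`."""
    starts = [start for start, _ in entries]
    idx = bisect_right(starts, when) - 1
    if idx < 0:
        raise LookupError("no on-call entry covers %s" % when)
    return entries[idx][1]


entries = parse_schedule(SCHEDULE)
current = on_call_at(entries, datetime(2010, 4, 22, 18, 45))
print(current)  # the name the "groupname-oncall" alias should point to
```

A cron job could run something like this at each handover time and rewrite the alias only when the answer changes.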
For years I have wanted to write a back-end notification script for Nagios that would make notifications more flexible, but I just haven't had the time. I want to be able to do things like:
o Notify via e-mail on warnings but send a text message for criticals, or, for some services/hosts, send text messages for everything
o Only notify the on-call individuals after hours but notify everyone during business hours
o Remove duplicate messages
o Allow for two-way messages so a reply can be sent via e-mail or SMS to ack an issue (I almost have this in place today)
o Have a configuration file that drives all of the above
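To make the wish list a little more concrete, here is a minimal Python sketch of the routing and de-duplication parts. Everything in it is hypothetical: the policy table, the function and class names, and the 08:00-17:30 weekday business hours (borrowed from the schedule quoted below) are assumptions for illustration only. It does not cover two-way acknowledgements, and it is not an existing Nagios interface.

```python
from datetime import datetime, time

# Hypothetical policy table; in practice this would be loaded from a config file.
POLICY = {
    "business_hours": (time(8, 0), time(17, 30)),  # assumed, Monday to Friday
    "sms_states": {"CRITICAL"},   # states that warrant a text message
    "sms_always": {"dbhost1"},    # hosts that send a text for every state
}


def in_business_hours(when, policy):
    """True on weekdays between the configured start and end times."""
    start, end = policy["business_hours"]
    return when.weekday() < 5 and start <= when.time() < end


def route(host, state, when, everyone, on_call, policy=POLICY):
    """Return a list of (contact, channel) pairs for one alert."""
    wants_sms = state in policy["sms_states"] or host in policy["sms_always"]
    channel = "sms" if wants_sms else "email"
    # Everyone is notified during business hours, only the on-call person otherwise.
    recipients = everyone if in_business_hours(when, policy) else [on_call]
    return [(person, channel) for person in recipients]


class Deduplicator:
    """Suppress alerts already sent for the same host, service and state."""

    def __init__(self):
        self._seen = set()

    def is_new(self, host, service, state):
        key = (host, service, state)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


team = ["person1", "person2"]
print(route("web1", "WARNING", datetime(2010, 4, 22, 10, 0), team, "person1"))
print(route("web1", "CRITICAL", datetime(2010, 4, 22, 18, 45), team, "person2"))
```

A real version would also need to expire de-duplication entries when a problem recovers, otherwise a repeat of the same failure later on would be suppressed.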
On Apr 22, 2010, at 02:42, Deborah Martin wrote:
>
> Is anybody able to help with this ?
>
> Thanks,
> Deborah
>
> From: Deborah Martin [mailto:Deborah.Martin at Kognitio.com]
> Sent: 21 April 2010 12:25
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Timeperiods and oncall rotation with UK Public holidays
> Importance: High
>
> Folks,
>
> I'm using SLES 10 and Nagios 3.2.0.
>
> We have 4 oncall engineers which rotate over a 4 week period, each being oncall one week at a time.
> The oncall period is 17:30 - 08:00 each working day and then the whole period for any weekend or UK public holiday.
>
> My definitions are :-
>
> define timeperiod{
> timeperiod_name 24x7
> alias 24 Hours A Day, 7 Days A Week
> sunday 00:00-24:00
> monday 00:00-24:00
> tuesday 00:00-24:00
> wednesday 00:00-24:00
> thursday 00:00-24:00
> friday 00:00-24:00
> saturday 00:00-24:00
> }
>
> This is for all normal monitoring of our systems.
>
> Each oncall engineer is defined :-
>
> define timeperiod{
> timeperiod_name person1-oncall
> alias person1-oncall
> 2010-03-29 / 28 17:30-24:00 ; Monday
> 2010-03-30 / 28 00:00-08:00,17:30-24:00 ; Tuesday
> 2010-03-31 / 28 00:00-08:00,17:30-24:00 ; Wednesday
> 2010-04-01 / 28 00:00-08:00,17:30-24:00 ; Thursday
> 2010-04-02 / 28 00:00-08:00,17:30-24:00 ; Friday
> 2010-04-03 / 28 00:00-24:00 ; Saturday
> 2010-04-04 / 28 00:00-24:00 ; Sunday
> 2010-04-05 / 28 00:00-08:00 ; Monday
> }
>
> define timeperiod{
> timeperiod_name person2-oncall
> alias person2-oncall
> 2010-04-05 / 28 17:30-24:00 ; Monday
> 2010-04-06 / 28 00:00-08:00,17:30-24:00 ; Tuesday
> 2010-04-07 / 28 00:00-08:00,17:30-24:00 ; Wednesday
> 2010-04-08 / 28 00:00-08:00,17:30-24:00 ; Thursday
> 2010-04-09 / 28 00:00-08:00,17:30-24:00 ; Friday
> 2010-04-10 / 28 00:00-24:00 ; Saturday
> 2010-04-11 / 28 00:00-24:00 ; Sunday
> 2010-04-12 / 28 00:00-08:00 ; Monday
> }
>
> define timeperiod{
> timeperiod_name person3-oncall
> alias person3-oncall
> 2010-04-12 / 28 17:30-24:00 ; Monday
> 2010-04-13 / 28 00:00-08:00,17:30-24:00 ; Tuesday
> 2010-04-14 / 28 00:00-08:00,17:30-24:00 ; Wednesday
> 2010-04-15 / 28 00:00-08:00,17:30-24:00 ; Thursday
> 2010-04-16 / 28 00:00-08:00,17:30-24:00 ; Friday
> 2010-04-17 / 28 00:00-24:00 ; Saturday
> 2010-04-18 / 28 00:00-24:00 ; Sunday
> 2010-04-19 / 28 00:00-08:00 ; Monday
> }
>
> define timeperiod{
> timeperiod_name person4-oncall
> alias person4-oncall
> 2010-04-19 / 28 17:30-24:00 ; Monday
> 2010-04-20 / 28 00:00-08:00,17:30-24:00 ; Tuesday
> 2010-04-21 / 28 00:00-08:00,17:30-24:00 ; Wednesday
> 2010-04-22 / 28 00:00-08:00,17:30-24:00 ; Thursday
> 2010-04-23 / 28 00:00-08:00,17:30-24:00 ; Friday
> 2010-04-24 / 28 00:00-24:00 ; Saturday
> 2010-04-25 / 28 00:00-24:00 ; Sunday
> 2010-04-26 / 28 00:00-08:00 ; Monday
> }
>
> I have escalations set for one particular client which will happen during oncall hours only and depending on the notification number, (4,5,6) will send an SMS alert to the relevant person oncall.
>
> ## Escalation ONE:
> define serviceescalation {
> host_name dbhost1
> service_description DB Conn Check
> first_notification 4
> last_notification 6
> notification_interval 15
> escalation_options c ; Only escalate for CRITICAL alerts
> escalation_period oncall
> contact_groups wx2-sms-oncall-group
> }
>
> define timeperiod{
> timeperiod_name oncall
> alias Oncall Hours
> sunday 00:00-24:00
> monday 00:00-08:00,17:30-24:00
> tuesday 00:00-08:00,17:30-24:00
> wednesday 00:00-08:00,17:30-24:00
> thursday 00:00-08:00,17:30-24:00
> friday 00:00-08:00,17:30-24:00
> saturday 00:00-24:00
> }
>
> And the sms-oncall-group defined for the service escalation includes all 4 oncall engineers but only the person actually oncall should get the sms alert based on their oncall timeperiods.
>
>
> define contactgroup{
> contactgroup_name wx2-sms-oncall-group
> alias WX2 Oncall
> members person1-oncall, person2-oncall, person3-oncall, person4-oncall
> }
>
> However, I've now hit a snag: how do I define UK public holiday periods as being 24 hours (particularly if they fall on a weekday) and put that timeperiod into each oncall engineer's timeperiod, so that whoever is oncall on a particular UK public holiday will get the escalation alerts for the entire 24 hour period rather than the usual defined oncall period of "00:00-08:00 and 17:30-24:00"?
>
> I'd rather not explicitly define a UK holiday date to an oncall engineer as this would need to be maintained. I'd rather just have to update the timeperiod if the person rota'ed cannot cover that particular timeperiod as this will be few and far between in comparison.
>
> If any more info is required please let me know. I'm probably missing something obvious but I've read the docs over a few times and can't seem to see what I want to do in there.
>
> Any pointers, help would be really appreciated.
>
> Thanks,
> Deborah
>
>
>
>
>
>
> ***************************************************************************
> This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed.
>
> Any unauthorised distribution or copying is strictly prohibited.
> Whilst Kognitio Limited takes steps to prevent the transmission of viruses via e-mail, we can not guarantee that any email or attachment is free from computer viruses and you are strongly advised to undertake your own anti-virus precautions.
>
> Kognitio grants no warranties regarding performance, use or quality of any e-mail or attachment and undertakes no liability for loss or damage, howsoever caused.
>
> Kognitio Limited, a company registered in England and Wales. Registered number 0212 7833. Registered Office: 3a Waterside Park, Cookham Road, Bracknell, Berks, RG12 1RB. VAT number 864 4378 92.
>
> Kognitio Inc, a company incorporated in Delaware, principal office 180 North Stetson, Suite 3500, Chicago, IL 60601, USA
> ***************************************************************************
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
> ------------------------------------------------------------------------------
> _______________________________________________
> Nagios-devel mailing list
> Nagios-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-devel
Dan Rich <drich at employees.org> | http://www.employees.org/~drich/
| "Step up to red alert!" "Are you sure, sir?
| It means changing the bulb in the sign..."
| - Red Dwarf (BBC)