[clug] Nagios escalations - help needed
Collins, Steve
Steve.Collins-TD049pqL5YylJ94bxmMSxg at public.gmane.org
Mon Sep 6 04:12:45 CEST 2004
I have the following need:
1. Monitor several services 24x7 on a group of hosts
2. During work hours, notify a limited group of users by email every 30 or 60 minutes (depending on the service) if any of the services aren't working, and likewise for host notifications (which work just fine).
3. 24x7, escalate service notifications so that the following occur:
* from notification 2 onwards, SMS our on-call person every 6 hours
* on notification 2 ONLY, email our main client and our service desk
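For context, the work-hours and out-of-hours windows referred to above map onto Nagios timeperiod definitions. A minimal sketch follows, assuming a 09:00-17:00 weekday working day; the hours and aliases are placeholders, not the values actually in use. Days that are not listed in a timeperiod are treated as excluded, so the out-of-hours period has to name Saturday and Sunday explicitly for weekend notifications to pass its filter.

```
# Assumed working hours; adjust to the real roster.
define timeperiod {
    timeperiod_name workhours
    alias           Work Hours (assumed 09:00-17:00)
    monday          09:00-17:00
    tuesday         09:00-17:00
    wednesday       09:00-17:00
    thursday        09:00-17:00
    friday          09:00-17:00
}

# Complement of the period above, including full weekends.
define timeperiod {
    timeperiod_name nonworkhours
    alias           Non-Work Hours
    sunday          00:00-24:00
    monday          00:00-09:00,17:00-24:00
    tuesday         00:00-09:00,17:00-24:00
    wednesday       00:00-09:00,17:00-24:00
    thursday        00:00-09:00,17:00-24:00
    friday          00:00-09:00,17:00-24:00
    saturday        00:00-24:00
}
```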
I get the service and host notifications I need by email just fine, but the escalations don't seem to work. I had a server die on the weekend and got no SMS. Below are the (I believe) relevant bits of my config files. I'd like to get the bits for webdev and webdev-oncall working. After that, I should be able to add in things like management, etc.
I'd greatly appreciate any advice on what I've got wrong and where (which is surely the case). I know that notifications are very config-dependent; I just need help figuring out where I've cruelled the config so that they are mucked up. What I'd initially like to do is set it all up with a really short set of notification intervals so I can test it all, and then push the intervals back to reality.
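A minimal sketch of what such a compressed test setup might look like, reusing the service and on-call escalation names from the excerpts below. The interval values are arbitrary test figures; they are counted in units of interval_length, which is 60 seconds by default, so they read as minutes.

```
# Test-only values: re-notify every 2 minutes instead of every 30.
define service {
    use                     NM-HTTP
    hostgroup_name          Macquarie_internet_zone
    service_description     Check HTTP [MCT]
    contact_groups          webdev
    check_period            24x7
    notification_interval   2
    notification_options    w,u,c,r
    notification_period     24x7
    check_command           check_http_mct
    max_check_attempts      3
    normal_check_interval   5
    retry_check_interval    1
}

# Test-only values: escalate from notification 2 onwards, every 5 minutes
# instead of every 360.
define serviceescalation {
    hostgroup_name          Macquarie_internet_zone
    service_description     Check HTTP [MCT]
    first_notification      2
    last_notification       0
    notification_interval   5
    contact_groups          webdevoncall
}
```

Once notifications arrive as expected, the two intervals would go back to 30 and 360.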
I'm using Nagios 1.2 and the latest NagMIN for config editing.
Host.cfg (typical entry)
~~~~~~~~~~~~~~~~~~~~~~~~
define host {
    use                     generic-host
    host_name               DONKEY
    alias                   DONKEY Server
    address                 172.16.2.30
    parents                 MacquarieRack
    check_command           check-host-alive
    max_check_attempts      3
    notification_interval   60
    notification_period     24x7
    notification_options    d,u,r
}
HostGroup.cfg
~~~~~~~~~~~~~
define hostgroup {
    hostgroup_name          Macquarie_internet_zone
    alias                   MCT Internet Zone
    contact_groups          online,webdev
    members                 DONKEY (plus several others)
}
Contact.cfg
~~~~~~~~~~~
define contact {
    use                             generic-contact
    contact_name                    scollins
    alias                           Stephen Collins
    email                           steve.collins at industry.gov.au
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-by-email
    host_notification_commands      host-notify-by-email
}
define contact {
    use                             generic-contact
    contact_name                    servicedesk
    alias                           Service Desk
    email                           servicedesk at industry.gov.au
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-by-email
    host_notification_commands      host-notify-by-email
}
define contact {
    use                             generic-contact
    contact_name                    webdev-oncall
    alias                           Web Development Team Oncall Member
    pager                           0421054024 at streetdata.com.au
    service_notification_period     nonworkhours
    host_notification_period        nonworkhours
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-by-epager
    host_notification_commands      host-notify-by-epager
}
Contactgroup.cfg
~~~~~~~~~~~~~~~~
define contactgroup {
    contactgroup_name       webdev
    alias                   Web Development Team Staff
    members                 brobinson,imacintosh,mwalsh,rbuerckner,scollins,sjanssens
}
define contactgroup {
    contactgroup_name       webdevoncall
    alias                   Web Development Team Staff - Oncall
    members                 imacintosh-sms,webdev-oncall
}
Service.cfg
~~~~~~~~~~~
define service {
    use                     NM-HTTP
    hostgroup_name          Macquarie_internet_zone
    service_description     Check HTTP [MCT]
    contact_groups          webdev
    check_period            24x7
    notification_interval   30
    notification_options    w,u,c,r
    notification_period     24x7
    check_command           check_http_mct
    max_check_attempts      3
    normal_check_interval   5
    retry_check_interval    1
}
ServiceEscalation.cfg
~~~~~~~~~~~~~~~~~~~~~
define serviceescalation {
    hostgroup_name          Macquarie_internet_zone
    service_description     Check HTTP [MCT]
    first_notification      2
    last_notification       0
    notification_interval   360
    contact_groups          webdevoncall
}
define serviceescalation {
    hostgroup_name          Macquarie_internet_zone
    service_description     Check HTTP [MCT]
    first_notification      2
    last_notification       2
    notification_interval   60
    contact_groups          servicedesk,webpub
}
Thanks!
Steve
--
Stephen Collins
Web Development Section
eBusiness Division
__________________________________________________
Department of Industry, Tourism and Resources
Level 12, 20 Allara Street, Canberra City ACT 2600
GPO Box 9839, Canberra ACT 2601
E steve.collins at industry.gov.au
P +61 2 62137193
C +61 410 680722
F +61 2 62136227