BUG: Recovery notifications sent to contacts which never received the initial problem notification
CHRIS TSENG (ULI-HK)
CHRISTSENG at UnitedLuminous.com
Wed Aug 20 16:30:39 CEST 2008
Hello,
I am having a notification issue as well; the version I am using is 3.03.
The email alert is hard to set up. Do you have any ideas about it?
Many thanks,
Chris
----- Original Message -----
From: nagios-devel-bounces at lists.sourceforge.net <nagios-devel-bounces at lists.sourceforge.net>
To: Nagios Developers List <nagios-devel at lists.sourceforge.net>
Sent: Wed Aug 20 22:25:08 2008
Subject: [Nagios-devel] BUG: Recovery notifications sent to contacts which never received the initial problem notification
Greetings,
it seems I have triggered a bug in our new Nagios instance, as it shows quite strange behaviour.
Quoting from the nagios 3.x documentation: http://nagios.sourceforge.net/docs/3_0/notifications.html
Service and Host Filters:
"Note: Notifications about host or service recoveries are only sent out if a notification was sent out
for the original problem. It doesn't make sense to get a recovery notification for something you never
knew was a problem... "
This is what happened:
1. Service went CRITICAL -> Notifications to the contacts user1-mail, user2-mail
2. Service went WARNING -> Notifications to the contacts user1-mail, user2-mail
3. Service went OK -> Notifications to the contacts user1-mail, user2-mail, user1-sms, user2-sms
vmctx02 CPU CRITICAL 18-08-2008 16:24:50 user1-mail mail-notification CRITICAL: 15m: average load 100% critical
vmctx02 CPU CRITICAL 18-08-2008 16:24:50 user2-mail mail-notification CRITICAL: 15m: average load 100% critical
vmctx02 CPU WARNING 18-08-2008 16:31:50 user1-mail mail-notification WARNING: 15m: average load 99% warning
vmctx02 CPU WARNING 18-08-2008 16:31:50 user2-mail mail-notification WARNING: 15m: average load 99% warning
vmctx02 CPU OK 18-08-2008 16:32:50 user1-sms sms-notification OK: 15m: average load 92%
vmctx02 CPU OK 18-08-2008 16:32:50 user2-sms sms-notification OK: 15m: average load 92%
vmctx02 CPU OK 18-08-2008 16:32:50 user1-mail mail-notification OK: 15m: average load 92%
vmctx02 CPU OK 18-08-2008 16:32:50 user2-mail mail-notification OK: 15m: average load 92%
I do not understand why the two SMS contacts were notified; they never received a
problem notification in the first place. An escalation triggered those SMS messages,
but in my opinion it should not have. In our environment it seems to happen only if
exactly 2 notifications were sent before a recovery.
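The mismatch with the documented rule can be checked mechanically. The following is only a minimal sketch, not anything taken from Nagios itself: it assumes the notification history is available as whitespace-separated lines in the layout shown above (host, service, state, date, time, contact, command, output) and that service descriptions contain no spaces. The function name `unexpected_recoveries` is hypothetical.

```python
# Sketch: list contacts that got a recovery (OK) notification without
# having received a problem notification for the same host/service first.
# Assumed field layout (taken from the log excerpt above):
#   host service state date time contact command output...

LOG = """\
vmctx02 CPU CRITICAL 18-08-2008 16:24:50 user1-mail mail-notification CRITICAL: 15m: average load 100% critical
vmctx02 CPU CRITICAL 18-08-2008 16:24:50 user2-mail mail-notification CRITICAL: 15m: average load 100% critical
vmctx02 CPU WARNING 18-08-2008 16:31:50 user1-mail mail-notification WARNING: 15m: average load 99% warning
vmctx02 CPU WARNING 18-08-2008 16:31:50 user2-mail mail-notification WARNING: 15m: average load 99% warning
vmctx02 CPU OK 18-08-2008 16:32:50 user1-sms sms-notification OK: 15m: average load 92%
vmctx02 CPU OK 18-08-2008 16:32:50 user2-sms sms-notification OK: 15m: average load 92%
vmctx02 CPU OK 18-08-2008 16:32:50 user1-mail mail-notification OK: 15m: average load 92%
vmctx02 CPU OK 18-08-2008 16:32:50 user2-mail mail-notification OK: 15m: average load 92%
"""


def unexpected_recoveries(log_text):
    """Return contacts whose OK notification had no preceding problem notification."""
    notified = set()   # (host, service, contact) triples that saw a problem
    unexpected = []    # contacts, in order of first appearance
    for line in log_text.splitlines():
        if not line.strip():
            continue
        host, service, state, _date, _time, contact, _command = line.split()[:7]
        key = (host, service, contact)
        if state == "OK":
            if key not in notified and contact not in unexpected:
                unexpected.append(contact)
        else:
            notified.add(key)
    return unexpected


print(unexpected_recoveries(LOG))  # ['user1-sms', 'user2-sms']
```

Run against the excerpt, this flags exactly the two SMS contacts, which matches the behaviour described above.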
These are the relevant configs:
Contacts and Templates (user1 and user2 are identical):
define contact {
name generic-contact-mail
host_notification_period 24x7
service_notification_period 24x7
host_notification_options d,r
service_notification_options u,c,w,r
host_notification_commands mail-notification
service_notification_commands mail-notification
register 0
}
define contact {
contact_name user1-mail
use generic-contact-mail
alias User1
email user1 at firma.com
}
define contact {
name generic-contact-sms
host_notification_period 24x7
service_notification_period 24x7
host_notification_options d,r
service_notification_options u,c,r
host_notification_commands sms-notification
service_notification_commands sms-notification
register 0
}
define contact {
contact_name user1-sms
use generic-contact-sms
alias S R
pager +49-DONT-CALL-ME
}
Service Templates and Service:
define service {
name generic-service
is_volatile 0
check_period 24x7
max_check_attempts 3
normal_check_interval 1
retry_check_interval 3
active_checks_enabled 1
passive_checks_enabled 1
parallelize_check 1
obsess_over_service 0
check_freshness 1
freshness_threshold 120
notifications_enabled 1
notification_interval 60
notification_period 24x7
notification_options u,c,w,r
event_handler_enabled 1
flap_detection_enabled 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
register 0
}
define service {
service_description CPU
use generic-service
host_name vmctx01
check_command check_nrpe_cpu!99%!100%
}
Service Escalation Templates and Escalations: (the escalation_period at that time was workhours)
define serviceescalation {
name service-minor-nonworkhours
first_notification 4
last_notification 4
notification_interval 60
escalation_period nonworkhours
escalation_options r,c
register 0
}
define serviceescalation {
name service-minor-workhours
first_notification 2
last_notification 2
notification_interval 60
escalation_period workhours
escalation_options r,c
register 0
}
define serviceescalation {
use service-minor-nonworkhours
host_name essctxsir06,essctx10,essctx04,essctxulg04,essctx11,essctxulg03,essctxsir03,essctxj01,essctxsir02,essctx03,essctxb06,essctxulg02,essctxsir05,essctxb01,essctxulg05,essctxulg01,essctx07,essctxulg06,essctxtest01,essctxtest01a,vmctx01,vmctx02,vmctx03,vmctx05,vmnrzctxulg03,vmnrzctxulg02,vmnrzctxulg01,nrzctxsir02,nrzctxsir01,nrzctxpps02,nrzctxpps01,nrzctxpcs01,nrzctxpcs02,vmnrzctxpcs02
service_description *
contact_groups citrixadmins,citrixadmins-sms
}
define serviceescalation {
use service-minor-workhours
host_name essctxsir06,essctx10,essctx04,essctxulg04,essctx11,essctxulg03,essctxsir03,essctxj01,essctxsir02,essctx03,essctxb06,essctxulg02,essctxsir05,essctxb01,essctxulg05,essctxulg01,essctx07,essctxulg06,essctxtest01,essctxtest01a,vmctx01,vmctx02,vmctx03,vmctx05,vmnrzctxulg03,vmnrzctxulg02,vmnrzctxulg01,nrzctxsir02,nrzctxsir01,nrzctxpps02,nrzctxpps01,nrzctxpcs01,nrzctxpcs02,vmnrzctxpcs02
service_description *
contact_groups citrixadmins,citrixadmins-sms
}
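For illustration, here is a small model of how the service-minor-workhours escalation could match. It is a sketch under stated assumptions, not the Nagios implementation: it models only the first_notification/last_notification range and escalation_options, ignores escalation_period, and assumes (unconfirmed for this version) that a recovery is evaluated against the number of problem notifications already sent. The names `Escalation` and `escalation_applies` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Escalation:
    name: str
    first_notification: int
    last_notification: int          # 0 means "no upper bound"
    escalation_options: frozenset   # state flags such as 'r', 'c', 'w', 'u'


def escalation_applies(esc, notification_number, state_flag):
    """True if the notification number is in range and the state is selected."""
    if notification_number < esc.first_notification:
        return False
    if esc.last_notification != 0 and notification_number > esc.last_notification:
        return False
    return state_flag in esc.escalation_options


# Values copied from the service-minor-workhours template above.
workhours = Escalation("service-minor-workhours", 2, 2, frozenset("rc"))

# ASSUMPTION: the recovery is checked against notification number 2,
# i.e. the count of problem notifications sent before it.
print(escalation_applies(workhours, 2, "r"))  # True
print(escalation_applies(workhours, 2, "w"))  # False
print(escalation_applies(workhours, 1, "r"))  # False
print(escalation_applies(workhours, 3, "r"))  # False
```

Under that assumption the model agrees with the observations: the second notification was a WARNING, which escalation_options r,c does not select, so the SMS contacts were skipped, while a recovery evaluated at number 2 does match. It would also explain why the effect shows up only after exactly 2 problem notifications.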
--
Sascha Runschke
Netzwerk- und Systemmanagement
Telefon : +49 (201) 102-1879 Mobil : +49 (173) 5419665 Fax : +49 (201) 102-1102105
GFKL Financial Services AG
Vorstand: Dr. Peter Jänsch (Vors.), Jürgen Baltes, Dr. Till Ergenzinger, Dr. Tom Haverkamp
Vorsitzender des Aufsichtsrats: Dr. Georg F. Thoma
Sitz: Limbecker Platz 1, 45127 Essen, Amtsgericht Essen, HRB 13522