<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.0.6617.6">
<TITLE>Re: [Nagios-devel] BUG: Recovery notifications sent to contacts which never received the initial problem notification</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<P><FONT SIZE=2>Hello,<BR>
<BR>
I am having a notification issue; the version I am using is 3.03.<BR>
The email alerts are hard to set up. Do you have any ideas?<BR>
<BR>
Many thanks,<BR>
<BR>
Chris<BR>
<BR>
----- Original Message -----<BR>
From: nagios-devel-bounces@lists.sourceforge.net &lt;nagios-devel-bounces@lists.sourceforge.net&gt;<BR>
To: Nagios Developers List &lt;nagios-devel@lists.sourceforge.net&gt;<BR>
Sent: Wed Aug 20 22:25:08 2008<BR>
Subject: [Nagios-devel] BUG: Recovery notifications sent to contacts which never received the initial problem notification<BR>
<BR>
<BR>
Greetings,<BR>
<BR>
It seems I have triggered a bug in our new Nagios instance, which is showing rather strange behaviour.<BR>
Quoting from the nagios 3.x documentation: <A HREF="http://nagios.sourceforge.net/docs/3_0/notifications.html">http://nagios.sourceforge.net/docs/3_0/notifications.html</A><BR>
Service and Host Filters:<BR>
<BR>
"Note: Notifications about host or service recoveries are only sent out if a notification was sent out<BR>
for the original problem. It doesn't make sense to get a recovery notification for something you never<BR>
knew was a problem... "<BR>
<BR>
This is what happened:<BR>
<BR>
1. Service went CRITICAL -> Notifications to the contacts user1-mail, user2-mail<BR>
2. Service went WARNING -> Notifications to the contacts user1-mail, user2-mail<BR>
3. Service went OK -> Notifications to the contacts user1-mail,user2-mail,user1-sms,user2-sms<BR>
<BR>
vmctx02 CPU CRITICAL 18-08-2008 16:24:50 user1-mail mail-notification CRITICAL: 15m: average load 100% critical<BR>
vmctx02 CPU CRITICAL 18-08-2008 16:24:50 user2-mail mail-notification CRITICAL: 15m: average load 100% critical<BR>
vmctx02 CPU WARNING 18-08-2008 16:31:50 user1-mail mail-notification WARNING: 15m: average load 99% warning<BR>
vmctx02 CPU WARNING 18-08-2008 16:31:50 user2-mail mail-notification WARNING: 15m: average load 99% warning<BR>
vmctx02 CPU OK 18-08-2008 16:32:50 user1-sms sms-notification OK: 15m: average load 92%<BR>
vmctx02 CPU OK 18-08-2008 16:32:50 user2-sms sms-notification OK: 15m: average load 92%<BR>
vmctx02 CPU OK 18-08-2008 16:32:50 user1-mail mail-notification OK: 15m: average load 92%<BR>
vmctx02 CPU OK 18-08-2008 16:32:50 user2-mail mail-notification OK: 15m: average load 92%<BR>
<BR>
I do not understand why the 2 sms contacts were notified, since they never received a<BR>
problem notification in the first place. An escalation triggered those sms notifications,<BR>
but in my opinion it should not have. In our environment this seems to happen only when<BR>
exactly 2 notifications were sent before the recovery.<BR>
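<BR>
To make the expectation concrete, here is a minimal Python sketch of the rule quoted from the documentation, applied to the eight log entries above. The function name and the data layout are made up for illustration; this is not Nagios code.<BR>
<BR>
```python
# Notification log rows from the report, in order: (state, contact).
LOG = [
    ("CRITICAL", "user1-mail"),
    ("CRITICAL", "user2-mail"),
    ("WARNING", "user1-mail"),
    ("WARNING", "user2-mail"),
    ("OK", "user1-sms"),
    ("OK", "user2-sms"),
    ("OK", "user1-mail"),
    ("OK", "user2-mail"),
]


def unexpected_recovery_contacts(log):
    """Return contacts that got an OK notification without an earlier problem one."""
    notified_of_problem = set()
    unexpected = []
    for state, contact in log:
        if state == "OK":
            # A recovery is only expected for contacts already told about the problem.
            if contact not in notified_of_problem and contact not in unexpected:
                unexpected.append(contact)
        else:
            notified_of_problem.add(contact)
    return unexpected


print(unexpected_recovery_contacts(LOG))  # ['user1-sms', 'user2-sms']
```
<BR>
Under that rule, the two sms contacts are exactly the ones that should not have received the recovery.<BR>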
<BR>
These are the relevant configs:<BR>
<BR>
<BR>
Contacts and Templates (user1 and user2 are identical):<BR>
<BR>
<BR>
define contact {<BR>
name generic-contact-mail<BR>
host_notification_period 24x7<BR>
service_notification_period 24x7<BR>
host_notification_options d,r<BR>
service_notification_options u,c,w,r<BR>
host_notification_commands mail-notification<BR>
service_notification_commands mail-notification<BR>
register 0<BR>
}<BR>
<BR>
define contact {<BR>
contact_name user1-mail<BR>
use generic-contact-mail<BR>
alias User1<BR>
email user1@firma.com<BR>
}<BR>
<BR>
define contact {<BR>
name generic-contact-sms<BR>
host_notification_period 24x7<BR>
service_notification_period 24x7<BR>
host_notification_options d,r<BR>
service_notification_options u,c,r<BR>
host_notification_commands sms-notification<BR>
service_notification_commands sms-notification<BR>
register 0<BR>
}<BR>
<BR>
define contact { <BR>
contact_name user1-sms<BR>
use generic-contact-sms<BR>
alias S R<BR>
pager +49-DONT-CALL-ME<BR>
} <BR>
<BR>
<BR>
Service Templates and Service:<BR>
<BR>
<BR>
define service {<BR>
name generic-service<BR>
is_volatile 0<BR>
check_period 24x7<BR>
max_check_attempts 3<BR>
normal_check_interval 1<BR>
retry_check_interval 3<BR>
active_checks_enabled 1<BR>
passive_checks_enabled 1<BR>
parallelize_check 1<BR>
obsess_over_service 0<BR>
check_freshness 1<BR>
freshness_threshold 120<BR>
notifications_enabled 1<BR>
notification_interval 60<BR>
notification_period 24x7<BR>
notification_options u,c,w,r<BR>
event_handler_enabled 1<BR>
flap_detection_enabled 1<BR>
process_perf_data 1<BR>
retain_status_information 1<BR>
retain_nonstatus_information 1<BR>
register 0<BR>
}<BR>
<BR>
define service {<BR>
service_description CPU<BR>
use generic-service<BR>
host_name vmctx01<BR>
check_command check_nrpe_cpu!99%!100%<BR>
}<BR>
<BR>
<BR>
Service Escalation Templates and Escalations: (the escalation_period at that time was workhours)<BR>
<BR>
<BR>
define serviceescalation {<BR>
name service-minor-nonworkhours<BR>
first_notification 4<BR>
last_notification 4<BR>
notification_interval 60<BR>
escalation_period nonworkhours<BR>
escalation_options r,c<BR>
register 0<BR>
} <BR>
<BR>
<BR>
define serviceescalation {<BR>
name service-minor-workhours<BR>
first_notification 2<BR>
last_notification 2<BR>
notification_interval 60<BR>
escalation_period workhours<BR>
escalation_options r,c<BR>
register 0<BR>
}<BR>
<BR>
define serviceescalation {<BR>
use service-minor-nonworkhours<BR>
host_name essctxsir06,essctx10,essctx04,essctxulg04,essctx11,essctxulg03,essctxsir03,essctxj01,essctxsir02,essctx03,essctxb06,essctxulg02,essctxsir05,essctxb01,essctxulg05,essctxulg01,essctx07,essctxulg06,essctxtest01,essctxtest01a,vmctx01,vmctx02,vmctx03,vmctx05,vmnrzctxulg03,vmnrzctxulg02,vmnrzctxulg01,nrzctxsir02,nrzctxsir01,nrzctxpps02,nrzctxpps01,nrzctxpcs01,nrzctxpcs02,vmnrzctxpcs02<BR>
service_description *<BR>
contact_groups citrixadmins,citrixadmins-sms<BR>
}<BR>
<BR>
<BR>
define serviceescalation {<BR>
use service-minor-workhours<BR>
host_name essctxsir06,essctx10,essctx04,essctxulg04,essctx11,essctxulg03,essctxsir03,essctxj01,essctxsir02,essctx03,essctxb06,essctxulg02,essctxsir05,essctxb01,essctxulg05,essctxulg01,essctx07,essctxulg06,essctxtest01,essctxtest01a,vmctx01,vmctx02,vmctx03,vmctx05,vmnrzctxulg03,vmnrzctxulg02,vmnrzctxulg01,nrzctxsir02,nrzctxsir01,nrzctxpps02,nrzctxpps01,nrzctxpcs01,nrzctxpcs02,vmnrzctxpcs02<BR>
service_description *<BR>
contact_groups citrixadmins,citrixadmins-sms<BR>
}<BR>
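<BR>
One possible reading of how the workhours escalation could match only the recovery, as a hedged Python sketch. The range-and-state check below is a simplified model, not Nagios source code, and the assumption that the recovery is evaluated against notification number 2 is mine and unverified. All names are hypothetical.<BR>
<BR>
```python
# Hypothetical model of an escalation match; not taken from the Nagios source.
ESCALATION = {
    "first_notification": 2,
    "last_notification": 2,
    "escalation_options": {"r", "c"},  # values from service-minor-workhours
}

# Maps a service state to the option letter used in the config.
STATE_TO_OPTION = {"OK": "r", "CRITICAL": "c", "WARNING": "w", "UNKNOWN": "u"}


def escalation_matches(escalation, notification_number, state):
    """True if the number is inside the range and the state is an escalated one."""
    numbers = range(escalation["first_notification"],
                    escalation["last_notification"] + 1)
    in_range = notification_number in numbers
    return in_range and STATE_TO_OPTION[state] in escalation["escalation_options"]


# The three notifications from the report. ASSUMPTION: the recovery is checked
# against the number of the last problem notification, which is 2.
events = [(1, "CRITICAL"), (2, "WARNING"), (2, "OK")]
results = [escalation_matches(ESCALATION, n, s) for n, s in events]
print(results)  # [False, False, True]
```
<BR>
In this model the CRITICAL notification falls outside the range, the WARNING notification is in range but not covered by escalation_options r,c, and only the recovery matches. That would fit the observation that the problem appears when exactly 2 notifications precede the recovery.<BR>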
<BR>
--<BR>
Sascha Runschke<BR>
Network and Systems Management<BR>
Phone : +49 (201) 102-1879 Mobile : +49 (173) 5419665 Fax : +49 (201) 102-1102105<BR>
<BR>
<BR>
GFKL Financial Services AG<BR>
Management Board: Dr. Peter Jänsch (Chairman), Jürgen Baltes, Dr. Till Ergenzinger, Dr. Tom Haverkamp<BR>
Chairman of the Supervisory Board: Dr. Georg F. Thoma<BR>
Registered office: Limbecker Platz 1, 45127 Essen, Essen District Court, HRB 13522<BR>
</FONT>
</P>
</BODY>
</HTML>