BUG: Recovery notifications sent to contacts which never received the initial problem notification
Sascha.Runschke at gfkl.com
Wed Aug 20 16:25:08 CEST 2008
Greetings,

I seem to have triggered a bug in our new Nagios instance, which is showing
some rather strange behaviour.
Quoting from the Nagios 3.x documentation:
http://nagios.sourceforge.net/docs/3_0/notifications.html

Service and Host Filters:
"Note: Notifications about host or service recoveries are only sent out if
a notification was sent out for the original problem. It doesn't make sense
to get a recovery notification for something you never knew was a
problem..."
This is what happened:
1. Service went CRITICAL -> Notifications to the contacts user1-mail, user2-mail
2. Service went WARNING -> Notifications to the contacts user1-mail, user2-mail
3. Service went OK -> Notifications to the contacts user1-mail, user2-mail, user1-sms, user2-sms

The corresponding entries from the notification log:

Host     Service  Type      Time                 Contact     Command            Information
vmctx02  CPU      CRITICAL  18-08-2008 16:24:50  user1-mail  mail-notification  CRITICAL: 15m: average load 100% critical
vmctx02  CPU      CRITICAL  18-08-2008 16:24:50  user2-mail  mail-notification  CRITICAL: 15m: average load 100% critical
vmctx02  CPU      WARNING   18-08-2008 16:31:50  user1-mail  mail-notification  WARNING: 15m: average load 99% warning
vmctx02  CPU      WARNING   18-08-2008 16:31:50  user2-mail  mail-notification  WARNING: 15m: average load 99% warning
vmctx02  CPU      OK        18-08-2008 16:32:50  user1-sms   sms-notification   OK: 15m: average load 92%
vmctx02  CPU      OK        18-08-2008 16:32:50  user2-sms   sms-notification   OK: 15m: average load 92%
vmctx02  CPU      OK        18-08-2008 16:32:50  user1-mail  mail-notification  OK: 15m: average load 92%
vmctx02  CPU      OK        18-08-2008 16:32:50  user2-mail  mail-notification  OK: 15m: average load 92%

I do not understand why the two SMS contacts were notified, since they never
received a problem notification in the first place. An escalation triggered
those SMS messages, but in my opinion it should not have. In our environment
this only seems to happen if exactly 2 notifications were sent before the
recovery.
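
As a rough sketch of the behaviour expected here (not of how Nagios actually implements its notification filters), the set of recovery recipients can be modelled as "every candidate contact that received at least one problem notification". The function name and the data layout below are made up for illustration; only the contact names and states come from the log above.

```python
# Problem notifications that were actually sent, in order,
# as (state, contact) pairs taken from the notification log.
history = [
    ("CRITICAL", "user1-mail"),
    ("CRITICAL", "user2-mail"),
    ("WARNING", "user1-mail"),
    ("WARNING", "user2-mail"),
]

# Contacts that Nagios selected for the recovery notification.
candidates = ["user1-mail", "user2-mail", "user1-sms", "user2-sms"]


def expected_recovery_recipients(history, candidates):
    """Keep only candidates that received at least one problem notification.

    Hypothetical helper: it models the expectation described in this
    report, not the real Nagios code path.
    """
    notified = {contact for state, contact in history if state != "OK"}
    return [contact for contact in candidates if contact in notified]


print(expected_recovery_recipients(history, candidates))
```

Under this model the two SMS contacts are filtered out, because neither of them appears in the problem history.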
These are the relevant configs:
Contacts and Templates (user1 and user2 are identical):
define contact {
    name                            generic-contact-mail
    host_notification_period        24x7
    service_notification_period     24x7
    host_notification_options       d,r
    service_notification_options    u,c,w,r
    host_notification_commands      mail-notification
    service_notification_commands   mail-notification
    register                        0
}

define contact {
    contact_name                    user1-mail
    use                             generic-contact-mail
    alias                           User1
    email                           user1 at firma.com
}

define contact {
    name                            generic-contact-sms
    host_notification_period        24x7
    service_notification_period     24x7
    host_notification_options       d,r
    service_notification_options    u,c,r
    host_notification_commands      sms-notification
    service_notification_commands   sms-notification
    register                        0
}

define contact {
    contact_name                    user1-sms
    use                             generic-contact-sms
    alias                           S R
    pager                           +49-DONT-CALL-ME
}

Service Templates and Service:
define service {
    name                            generic-service
    is_volatile                     0
    check_period                    24x7
    max_check_attempts              3
    normal_check_interval           1
    retry_check_interval            3
    active_checks_enabled           1
    passive_checks_enabled          1
    parallelize_check               1
    obsess_over_service             0
    check_freshness                 1
    freshness_threshold             120
    notifications_enabled           1
    notification_interval           60
    notification_period             24x7
    notification_options            u,c,w,r
    event_handler_enabled           1
    flap_detection_enabled          1
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    register                        0
}

define service {
    service_description             CPU
    use                             generic-service
    host_name                       vmctx01
    check_command                   check_nrpe_cpu!99%!100%
}

Service Escalation Templates and Escalations (the escalation_period at that
time was workhours):

define serviceescalation {
    name                            service-minor-nonworkhours
    first_notification              4
    last_notification               4
    notification_interval           60
    escalation_period               nonworkhours
    escalation_options              r,c
    register                        0
}

define serviceescalation {
    name                            service-minor-workhours
    first_notification              2
    last_notification               2
    notification_interval           60
    escalation_period               workhours
    escalation_options              r,c
    register                        0
}

define serviceescalation {
    use                             service-minor-nonworkhours
    host_name                       essctxsir06,essctx10,essctx04,essctxulg04,essctx11,essctxulg03,essctxsir03,essctxj01,essctxsir02,essctx03,essctxb06,essctxulg02,essctxsir05,essctxb01,essctxulg05,essctxulg01,essctx07,essctxulg06,essctxtest01,essctxtest01a,vmctx01,vmctx02,vmctx03,vmctx05,vmnrzctxulg03,vmnrzctxulg02,vmnrzctxulg01,nrzctxsir02,nrzctxsir01,nrzctxpps02,nrzctxpps01,nrzctxpcs01,nrzctxpcs02,vmnrzctxpcs02
    service_description             *
    contact_groups                  citrixadmins,citrixadmins-sms
}

define serviceescalation {
    use                             service-minor-workhours
    host_name                       essctxsir06,essctx10,essctx04,essctxulg04,essctx11,essctxulg03,essctxsir03,essctxj01,essctxsir02,essctx03,essctxb06,essctxulg02,essctxsir05,essctxb01,essctxulg05,essctxulg01,essctx07,essctxulg06,essctxtest01,essctxtest01a,vmctx01,vmctx02,vmctx03,vmctx05,vmnrzctxulg03,vmnrzctxulg02,vmnrzctxulg01,nrzctxsir02,nrzctxsir01,nrzctxpps02,nrzctxpps01,nrzctxpcs01,nrzctxpcs02,vmnrzctxpcs02
    service_description             *
    contact_groups                  citrixadmins,citrixadmins-sms
}

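
To make the interaction between the notification number and escalation_options easier to follow, here is a deliberately simplified model of the service-minor-workhours escalation. It is only a sketch: the class and function names are hypothetical, escalation_period is ignored, and the assumption that the recovery is evaluated with the number of the last problem notification (2) is a guess that happens to reproduce the observed behaviour, not a statement about the Nagios source.

```python
from dataclasses import dataclass


@dataclass
class Escalation:
    """Simplified stand-in for a serviceescalation definition."""
    first_notification: int
    last_notification: int
    options: frozenset  # state letters from escalation_options, e.g. {"r", "c"}


def escalation_applies(esc, number, state_letter):
    """True if the notification number is in range and the state is listed.

    Hypothetical helper; ignores escalation_period and any special values.
    """
    in_range = esc.first_notification <= number <= esc.last_notification
    return in_range and state_letter in esc.options


# Values taken from the service-minor-workhours template above.
workhours = Escalation(first_notification=2, last_notification=2,
                       options=frozenset("rc"))

# 1st notification, CRITICAL: outside the 2..2 range, so no escalation.
print(escalation_applies(workhours, 1, "c"))
# 2nd notification, WARNING: in range, but "w" is not in r,c.
print(escalation_applies(workhours, 2, "w"))
# Recovery, ASSUMING it is evaluated as notification number 2.
print(escalation_applies(workhours, 2, "r"))
```

Under that assumption only the recovery matches the escalation, which would explain why the citrixadmins-sms contacts first appear at step 3.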
--
Sascha Runschke
Netzwerk- und Systemmanagement
Telefon : +49 (201) 102-1879 Mobil : +49 (173) 5419665 Fax : +49 (201)
102-1102105
GFKL Financial Services AG
Vorstand: Dr. Peter Jänsch (Vors.), Jürgen Baltes, Dr. Till Ergenzinger, Dr. Tom Haverkamp
Vorsitzender des Aufsichtsrats: Dr. Georg F. Thoma
Sitz: Limbecker Platz 1, 45127 Essen, Amtsgericht Essen, HRB 13522