No notification on hard state change after an acknowledgement
Scott Gwartney
scott.gwartney at nwea.org
Wed Apr 30 23:06:21 CEST 2008
Sorry for the long subject and post. We're running Nagios 2.10 on CentOS 5.
After we acknowledge a service alert in the WARNING state, we don't
receive a notification when the service later goes CRITICAL.
For example: we're monitoring the E drive on a file server. The drive
goes into a warning state, Nagios sends an alert, and an acknowledgement
is entered. Later the drive goes critical, but an alert is never sent.
The relevant log entries and config files follow. Thanks for the
help!
Log File:

E drive goes into warning:
Apr 29 15:10:38 DataCenterMon nagios: SERVICE NOTIFICATION: XX;X;Disk Usage E Drive;WARNING;notify-by-epager;e:\ - total: 263.99 Gb - used: 243.89 Gb (92%) - free 20.10 Gb (8%)

E drive is acknowledged:
Apr 29 15:11:26 DataCenterMon nagios: EXTERNAL COMMAND: ACKNOWLEDGE_SVC_PROBLEM;X;Disk Usage E Drive;2;1;1;Nagios Admin;jf

Acknowledgement is sent:
Apr 29 15:11:26 DataCenterMon nagios: SERVICE NOTIFICATION: XX;X;Disk Usage E Drive;ACKNOWLEDGEMENT (WARNING);notify-by-email;e:\ - total: 263.99 Gb - used: 243.89 Gb (92%) - free 20.10 Gb (8%);Nagios Admin;jf
Apr 29 15:11:27 DataCenterMon nagios: SERVICE NOTIFICATION: XX;X;Disk Usage E Drive;ACKNOWLEDGEMENT (WARNING);notify-by-epager;e:\ - total: 263.99 Gb - used: 243.89 Gb (92%) - free 20.10 Gb (8%);Nagios Admin;jf

E drive goes critical, no alert sent:
Apr 30 10:07:16 DataCenterMon nagios: SERVICE ALERT: X;Disk Usage E Drive;CRITICAL;HARD;3;e:\ - total: 263.99 Gb - used: 251.33 Gb (95%) - free 12.67 Gb (5%)
Apr 30 11:04:16 DataCenterMon nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;X;Disk Usage E Drive;1209578654

Acknowledgement is removed and alert is sent:
Apr 30 11:05:19 DataCenterMon nagios: EXTERNAL COMMAND: REMOVE_SVC_ACKNOWLEDGEMENT;X;Disk Usage E Drive
Apr 30 11:05:49 DataCenterMon nagios: EXTERNAL COMMAND: SCHEDULE_FORCED_SVC_CHECK;X;Disk Usage E Drive;1209578747
Apr 30 11:05:57 DataCenterMon nagios: SERVICE NOTIFICATION: XX;X;Disk Usage E Drive;CRITICAL;notify-by-email;e:\ - total: 263.99 Gb - used: 254.71 Gb (96%) - free 9.29 Gb (4%)
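
As a reading aid for the acknowledgement command logged above, here is a minimal Python sketch that splits it into named fields. It assumes the field order given in the Nagios external command documentation (host, service, sticky, notify, persistent, author, comment). The helper names `ServiceAck`, `parse_service_ack` and `is_sticky` are hypothetical and not part of Nagios. As far as I understand that documentation, a sticky value of 2 keeps the acknowledgement in place until the service returns to OK, which would be consistent with the missing CRITICAL notification; treat that reading as an assumption to verify against your version.

```python
from typing import NamedTuple


class ServiceAck(NamedTuple):
    """Fields of an ACKNOWLEDGE_SVC_PROBLEM command (assumed order)."""
    host: str
    service: str
    sticky: int
    notify: int
    persistent: int
    author: str
    comment: str


def parse_service_ack(command: str) -> ServiceAck:
    """Split the command string on ';' into its eight parts."""
    # maxsplit=7 keeps any ';' inside the free-text comment intact.
    name, host, service, sticky, notify, persistent, author, comment = command.split(";", 7)
    if name != "ACKNOWLEDGE_SVC_PROBLEM":
        raise ValueError(f"not a service acknowledgement: {name}")
    return ServiceAck(host, service, int(sticky), int(notify), int(persistent), author, comment)


def is_sticky(ack: ServiceAck) -> bool:
    # Assumption: 2 means "sticky" (acknowledgement stays until the service is OK).
    return ack.sticky == 2


# Command string copied from the log entry above.
ack = parse_service_ack("ACKNOWLEDGE_SVC_PROBLEM;X;Disk Usage E Drive;2;1;1;Nagios Admin;jf")
print(ack)
print("sticky:", is_sticky(ack))
```

If the sticky reading is right, resubmitting the acknowledgement with the sticky field set to 1 would be the thing to try.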

Config Files:

# Host Template for Critical Hosts -- [E]Pager and Email Notification to x 24x7
define host{
        name                            Critical_Host   ; The name of this host template - referenced in other host definitions, used for template recursion/resolution
        notifications_enabled           1               ; Host notifications are enabled
        event_handler_enabled           1               ; Host event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        notification_period             24x7            ; Notifies 24x365
        notification_options            d,u,r           ; Down, Unreachable, Recovery
        notification_interval           5               ; Sends Page/Email every 5 minutes
        check_command                   check_ping!1000.0,20%!30000.0,100%      ; Warns at 20% packet loss or round trip time > 1000 ms. Critical at 100% packet loss or 30000 ms round trip
        max_check_attempts              5               ; Checks host 5 times before generating an alert
        contact_groups                  x
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL HOST, JUST A TEMPLATE!
        }

# 'NWWEBNAS' host definition
define host{
        use                             Critical_Host   ; Name of host template to use
        host_name                       X
        alias                           Production File Server
        address                         x.x.x.x
        parents                         X
        }

# Critical Service definition template
define service{
        name                            Critical_Service        ; The 'name' of this service template, referenced in other service definitions
        active_checks_enabled           1               ; Active service checks are enabled
        passive_checks_enabled          1               ; Passive service checks are enabled/accepted
        parallelize_check               1               ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1               ; We should obsess over this service (if necessary)
        is_volatile                     0
        check_freshness                 0               ; Default is to NOT check service 'freshness'
        notifications_enabled           1               ; Service notifications are enabled
        event_handler_enabled           1               ; Service event handler is enabled
        flap_detection_enabled          1               ; Flap detection is enabled
        process_perf_data               1               ; Process performance data
        retain_status_information       1               ; Retain status information across program restarts
        retain_nonstatus_information    1               ; Retain non-status information across program restarts
        event_handler_enabled           1               ; Event handler is enabled
        check_period                    24x7_With_Maintenance_Window    ; Checks 24x7x365
        normal_check_interval           10              ; When service is OK it will be checked every 10 minutes
        max_check_attempts              3               ; When service is not OK it will check 3 times before sending an alert
        retry_check_interval            1               ; Retries every 1 minute once service is not OK. After max_check_attempts has been reached it rechecks at normal_check_interval
        notification_interval           10              ; Sends notifications every 10 minutes
        notification_period             24x7            ; Notifies 24x365
        notification_options            w,u,c,r         ; Sends alerts at Warning, Unknown, Critical and Recovery
        contact_groups                  x               ; Emails ISOpsOnCall and pages ISOnCallCell
        register                        0               ; DON'T REGISTER THIS DEFINITION - IT'S NOT A REAL SERVICE, JUST A TEMPLATE!
        }

# Service definition
define service{
        use                             Critical_Service        ; Name of service template to use
        host_name                       X
        service_description             Disk Usage E Drive
        check_command                   check_nt_disk!e!80!95
        }
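
To relate the check_command arguments to the states in the log, here is a small Python sketch. The check_nt_disk command definition is not shown in this post, so the sketch assumes that the second and third arguments (80 and 95) are the warning and critical thresholds on percent of disk used, and that a value at or above a threshold triggers that state. The function names are hypothetical.

```python
def parse_check_command(check_command):
    """Split 'name!arg1!arg2...' into the command name and its argument list."""
    name, *args = check_command.split("!")
    return name, args


def classify_used_percent(used_pct, warn, crit):
    """Assumed threshold logic: at or above crit is CRITICAL, at or above warn is WARNING."""
    if used_pct >= crit:
        return "CRITICAL"
    if used_pct >= warn:
        return "WARNING"
    return "OK"


name, args = parse_check_command("check_nt_disk!e!80!95")
drive, warn, crit = args[0], float(args[1]), float(args[2])

# Used/total figures (in Gb) copied from the log entries in this post.
total_gb = 263.99
for used_gb in (243.89, 251.33, 254.71):
    used_pct = used_gb / total_gb * 100
    print(f"{drive}: {used_pct:.1f}% used -> {classify_used_percent(used_pct, warn, crit)}")
```

Under these assumptions the first reading falls in the WARNING band and the later two in the CRITICAL band, matching the states Nagios logged.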