Service Went Down, No notification sent...
John McGowan
mcgowan at lynch2.com
Thu Aug 11 17:36:42 CEST 2005
I'm not sure exactly where to start on this. Last night I checked on
my services and noticed one that was down and had been down for about
20 minutes. No notification was ever sent out. This is what I saw in
the log:
[08-10-2005 22:07:36] SERVICE ALERT: tessweb;Tessitura SeatServer;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
When I ran a test this morning this is what I saw in the event log.
[08-11-2005 10:17:39] SERVICE ALERT: tessweb;Tessitura SeatServer;CRITICAL;SOFT;1;No route to host
[08-11-2005 10:18:08] SERVICE ALERT: tessweb;Tessitura SeatServer;CRITICAL;SOFT;2;No route to host
[08-11-2005 10:18:38] SERVICE ALERT: tessweb;Tessitura SeatServer;CRITICAL;SOFT;3;No route to host
[08-11-2005 10:19:08] SERVICE ALERT: tessweb;Tessitura SeatServer;CRITICAL;SOFT;4;No route to host
[08-11-2005 10:19:38] SERVICE ALERT: tessweb;Tessitura SeatServer;CRITICAL;HARD;5;No route to host
The first thing that stood out was that max_check_attempts didn't seem
to make a difference last night: the service went to a CRITICAL HARD
state on the first failure, with no SOFT retries.
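To make the observation above easy to check against a longer log, here is a minimal Python sketch that parses SERVICE ALERT lines in the format quoted in this post (host;service;state;state type;attempt;plugin output) and flags any non-OK HARD state reached before the configured number of attempts. The function name early_hard_alerts and the regular expression are illustrative assumptions, not part of Nagios; the sample lines are the ones quoted above.

```python
import re

# Layout of the SERVICE ALERT lines quoted in this post:
# [timestamp] SERVICE ALERT: host;service;state;state_type;attempt;plugin output
ALERT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] SERVICE ALERT: "
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>SOFT|HARD);(?P<attempt>\d+);(?P<output>.*)$"
)


def early_hard_alerts(lines, max_check_attempts):
    """Return non-OK HARD alerts logged before max_check_attempts was reached.

    Hypothetical helper for illustration only; it is not a Nagios tool.
    """
    flagged = []
    for line in lines:
        match = ALERT_RE.match(line.strip())
        if not match:
            continue  # skip lines that are not service alerts
        alert = match.groupdict()
        alert["attempt"] = int(alert["attempt"])
        if (alert["state_type"] == "HARD"
                and alert["state"] != "OK"
                and alert["attempt"] < max_check_attempts):
            flagged.append(alert)
    return flagged


# The log lines quoted in this post, with mail line wrapping undone.
PREFIX = "SERVICE ALERT: tessweb;Tessitura SeatServer;CRITICAL;"
LOG = [
    "[08-10-2005 22:07:36] " + PREFIX + "HARD;1;CRITICAL - Socket timeout after 10 seconds",
    "[08-11-2005 10:17:39] " + PREFIX + "SOFT;1;No route to host",
    "[08-11-2005 10:18:08] " + PREFIX + "SOFT;2;No route to host",
    "[08-11-2005 10:18:38] " + PREFIX + "SOFT;3;No route to host",
    "[08-11-2005 10:19:08] " + PREFIX + "SOFT;4;No route to host",
    "[08-11-2005 10:19:38] " + PREFIX + "HARD;5;No route to host",
]

for alert in early_hard_alerts(LOG, max_check_attempts=5):
    print(alert["ts"], alert["state_type"], alert["attempt"])
```

With max_check_attempts set to 5 as in the template, only the 22:07:36 entry from last night is flagged; this morning's test went through four SOFT attempts before going HARD on the fifth.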
FYI: the failure last night was caused by the VPN between here and
there being down.
The service definition is listed below, along with its template:
define service{
        name                            generic-service ; The 'name' of this service template, referenced in other service definitions
        active_checks_enabled           1       ; Active service checks are enabled
        passive_checks_enabled          1       ; Passive service checks are enabled/accepted
        parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1       ; We should obsess over this service (if necessary)
        check_freshness                 0       ; Default is to NOT check service 'freshness'
        notifications_enabled           1       ; Service notifications are enabled
        event_handler_enabled           1       ; Service event handler is enabled
        flap_detection_enabled          1       ; Flap detection is enabled
        process_perf_data               1       ; Process performance data
        retain_status_information       1       ; Retain status information across program restarts
        retain_nonstatus_information    1       ; Retain non-status information across program restarts
        is_volatile                     0
        check_period                    24x7
        notification_period             24x7
        notification_interval           120
        notification_options            w,u,c,r
        contact_groups                  admins
        max_check_attempts              5
        normal_check_interval           60
        retry_check_interval            30
        register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }
# Service definition
define service{
        use                             generic-service ; Name of service template to use
        contact_groups                  csoadmins
        host_name                       tessweb
        service_description             Tessitura SeatServer
        check_command                   check_http_site2_ssl!tessweb.cso.org!true!/Tessitura.asmx/WebSeatServerListening
        }
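As a rough cross-check of the retry settings above against this morning's test, the sketch below works out how long the SOFT retries should take before the service goes HARD. It assumes interval_length in nagios.cfg is set to 1 second; that value is not shown in this post and is only inferred from the roughly 30-second spacing between the retries in the log (with the default of 60, retry_check_interval 30 would mean 30 minutes). The function name seconds_until_hard is illustrative.

```python
from datetime import datetime


def seconds_until_hard(max_check_attempts, retry_check_interval, interval_length):
    """Seconds from the first failed check to the HARD state, ignoring check latency.

    After the first failure there are (max_check_attempts - 1) retries,
    each scheduled retry_check_interval time units after the previous one.
    """
    return (max_check_attempts - 1) * retry_check_interval * interval_length


# Values taken from the generic-service template in this post.
MAX_CHECK_ATTEMPTS = 5
RETRY_CHECK_INTERVAL = 30

# ASSUMPTION: interval_length is not shown in the post; 1 second is inferred
# from the ~30 s gaps between the SOFT alerts logged this morning.
ASSUMED_INTERVAL_LENGTH = 1

expected = seconds_until_hard(
    MAX_CHECK_ATTEMPTS, RETRY_CHECK_INTERVAL, ASSUMED_INTERVAL_LENGTH
)

# Timestamps of the first SOFT alert and the HARD alert from this morning's test.
LOG_TIME_FORMAT = "%m-%d-%Y %H:%M:%S"
first_soft = datetime.strptime("08-11-2005 10:17:39", LOG_TIME_FORMAT)
went_hard = datetime.strptime("08-11-2005 10:19:38", LOG_TIME_FORMAT)
observed = (went_hard - first_soft).total_seconds()

print("expected seconds to HARD:", expected)
print("observed seconds to HARD:", observed)
```

Under that assumption the expected figure is 120 seconds and the log shows 119, so this morning's test followed the configured retry schedule. Last night's alert went HARD at attempt 1 and did not follow it.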
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users