question about recovery messages
Paul Lynch
Paul_Lynch at lenox.com
Tue Jun 14 17:50:43 CEST 2011
Hi Everyone,
I just joined the forum today, so I will apologize up front for the
somewhat basic nature of my question. I have not been able to find
anything about it yet. It is possible I have not spent enough time
searching, but if someone can point me in the right direction it
would be appreciated.
I have been running Nagios for well over a year in a very limited
capacity in my environment. I originally installed it as 3.0.3 and
set up about a dozen Windows servers to monitor CPU, memory and disk
utilization. For this it has been great.
I knew there was much more Nagios could help with, so I've been
looking for opportunities where it can add value in supporting our
infrastructure. A few weeks ago someone in our web group complained
about constantly having to check our website to see whether it is up,
as there have been some stability issues with it. It runs on six
load-balanced web servers. I suggested a Nagios service check.
So I am using check_website_response by Chris Freeman from the
Nagios Exchange. Every now and then I get a critical notification,
but I never get a recovery notification for it. Based on my current
settings I would also expect a reminder an hour later, and I don't
get one.
I am just curious to know if anyone else has inconsistencies with email
alerts on state changes?
Thanks in advance.
-Paul
-------------------------- IP addresses and names have been changed to
protect the innocent.....
RESPONSE: CRITICAL - http://10.1.0.131 does not contain any data
My template looks like this:
#==============================================================================
# Service Templates
#------------------------------------------------------------------------------
define service{
name NWKService
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
is_volatile 0 ; The service is not volatile
check_period 24x7 ; The service can be checked at any time of the day
max_check_attempts 3 ; Re-check the service up to 3 times in order to determine its final (hard) state
normal_check_interval 1 ; Check the service every minute under normal conditions (with the default interval_length of 60 seconds)
retry_check_interval 1 ; Re-check the service every minute until a hard state can be determined
contact_groups websiteresponse ; Notifications get sent out to everyone in the 'websiteresponse' group
notification_options u,c,r ; Send notifications about unknown, critical, and recovery events (no 'w', so no warning notifications)
notification_interval 120 ; Re-notify about service problems every 120 minutes (two hours)
notification_period 24x7 ; Notifications can be sent out at any time
}
############################## Episode3
define host{
use NwkHost
host_name Episode3
alias Episode3
address Episode3
parents 3524,3524B
}
define service{
use NWKService
host_name Episode3
service_description Response Time - Homepage
servicegroups www-response-time
check_command check_website_response!"http://10.1.0.131/"!5000!30000
}
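
For comparison, here is a minimal sketch of the same service with the notification settings overridden locally. It assumes interval_length in nagios.cfg is still at its default of 60 seconds (that file is not shown in this post), in which case notification_interval counts minutes and the 120 inherited from the template means a reminder every two hours, not every hour. Adding w to notification_options is optional and only matters if warning notifications are wanted. This is an untested illustration, not a confirmed fix for the missing recovery messages.

```
define service{
use NWKService
host_name Episode3
service_description Response Time - Homepage
servicegroups www-response-time
check_command check_website_response!"http://10.1.0.131/"!5000!30000
notification_interval 60 ; Assumption: interval_length is 60 seconds, so this re-notifies hourly
notification_options w,u,c,r ; Warning, unknown, critical, and recovery events
}
```

Values set directly in a service definition take precedence over those inherited through use, so the template itself can stay unchanged for other services.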
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users