False alerts on http service
francis picabia
fpicabia at gmail.com
Wed Sep 12 19:03:16 CEST 2012
We have used nagios successfully for many years and have never seen
a case like this. I cannot get nagios to see that the remote
http service is up, although the check command indicates it is up
and the remote apache log shows nagios visited with no error.
The site to monitor runs webwork, a math quiz system. I have it
set to redirect / to /webwork and also redirect insecure to https.
At first I did a plain check_http.
I then switched to the -S option and added -u with the full URL to avoid
hitting the redirects, so that I get a clean code 200 returned, in case
that was muddling things. No difference.
When I look at the apache log, I can see the visits from nagios.
For the early morning visits, there is no one
using the system, so it can't be unresponsive.
Here is my check command:
# 'check_www_ssl' command definition
define command{
        command_name    check_www_ssl
        command_line    $USER1$/check_http -S -I $HOSTADDRESS$ -f follow -w 5 -c 20 -t 60 -u $ARG1$
        }
Here is my service:
define service{
        use                     generic-service
        host_name               webwork
        is_volatile             0
        service_description     Webwork Web Service
        check_command           check_www_ssl!'https://webwork.example.com/webwork/'
        check_period            24x7
        contact_groups          unix-admins
        max_check_attempts      3
        normal_check_interval   3
        retry_check_interval    1
        notification_interval   120
        notification_period     24x7
        notification_options    w,u,c,r
        }
Of course I have changed the actual domain to example.com in the above.
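
To make it easier to compare the configured check with a manual run, here is a rough Python sketch of how the check_command and command_line above combine into one command. It is a simplified illustration, not Nagios's actual macro code: the helper expand_check_command is hypothetical, the $USER1$ value is assumed to be the plugin directory used in the manual run further down, and $HOSTADDRESS$ is assumed to be the address shown in the alert report.

```python
import shlex


def expand_check_command(check_command, command_line, macros):
    """Simplified sketch of macro substitution (hypothetical helper).

    Splits 'name!arg1!arg2' on '!' and fills $ARGn$ plus the given macros.
    Real Nagios handles escaping and many more macros than this.
    """
    name, *args = check_command.split("!")
    values = dict(macros)
    for position, arg in enumerate(args, start=1):
        values[f"$ARG{position}$"] = arg
    expanded = command_line
    for macro, value in values.items():
        expanded = expanded.replace(macro, value)
    return name, expanded


COMMAND_LINE = ("$USER1$/check_http -S -I $HOSTADDRESS$ -f follow "
                "-w 5 -c 20 -t 60 -u $ARG1$")
CHECK_COMMAND = "check_www_ssl!'https://webwork.example.com/webwork/'"
MACROS = {
    "$USER1$": "/usr/lib/nagios3.2/libexec",  # assumption: plugin directory
    "$HOSTADDRESS$": "131.162.201.91",        # assumption: from the alert report
}

name, expanded = expand_check_command(CHECK_COMMAND, COMMAND_LINE, MACROS)
argv = shlex.split(expanded)  # what a shell would pass to check_http
print(name)
print(expanded)
print(argv[-2:])
```

Under these assumptions the service check differs from the manual test only in the -I value (an IP address instead of the name webwork) and in the trailing slash on the URL.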
The alert report:
***** Nagios 3.2 *****
Notification Type: PROBLEM
Host: webwork
State: DOWN
Address: 131.162.201.91
Info: Server answer:
Date/Time: Wed Sept 12 06:59:04 ADT 2012
Here is a sample visit from nagios in the webwork apache log file
before this time.
XXX.YYY.2.50 - - [12/Sep/2012:06:58:50 -0300] "GET https://webwork.acadiau.ca/webwork/ HTTP/1.0" 200 5015 "-" "check_http/v1.4.14 (nagios-plugins 1.4.14)"
Our apache logs show nagios is visiting every 3 minutes, 24 hours a day. None
of these visits results in an error.
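
For anyone who wants to verify the "no errors" claim across a whole log rather than by eye, a minimal Python sketch along these lines could pull the status code and user agent out of each entry. It assumes the log is in Apache's combined format, as the sample line suggests; the names COMBINED_RE and parse_access_line are made up for this illustration.

```python
import re
from datetime import datetime

# Assumption: Apache "combined" log format, matching the sample line.
COMBINED_RE = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$'
)


def parse_access_line(line):
    """Return a dict for one access-log line, or None if it does not match."""
    m = COMBINED_RE.match(line.strip())
    if m is None:
        return None
    host, _ident, _user, stamp, request, status, size, _referer, agent = m.groups()
    return {
        "host": host,
        "time": datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"),
        "request": request,
        "status": int(status),
        "size": None if size == "-" else int(size),
        "agent": agent,
    }


sample = ('XXX.YYY.2.50 - - [12/Sep/2012:06:58:50 -0300] '
          '"GET https://webwork.acadiau.ca/webwork/ HTTP/1.0" 200 5015 "-" '
          '"check_http/v1.4.14 (nagios-plugins 1.4.14)"')

entry = parse_access_line(sample)
is_nagios = entry["agent"].startswith("check_http")
print(entry["status"], entry["size"], is_nagios)
```

Filtering on the check_http user agent and collecting any status other than 200 would show whether a nagios visit ever failed.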
In a nagios log, this is all that appears for webwork for the day:
# grep webwork nagios-09-11-2012-00.log
[1347246000] CURRENT HOST STATE: webwork;DOWN;HARD;1;Server answer:
[1347246000] CURRENT SERVICE STATE: webwork;Webwork Web Service;OK;HARD;1;HTTP OK: HTTP/1.1 200 OK - 4053 bytes in 0.274 second response time
[1347270994] HOST NOTIFICATION: david;webwork;DOWN;host-notify-by-email;Server answer:
[1347270994] HOST NOTIFICATION: bob;webwork;DOWN;host-notify-by-email;Server answer:
[1347270994] HOST NOTIFICATION: winston;webwork;DOWN;host-notify-by-email;Server answer:
[1347270994] HOST NOTIFICATION: larry;webwork;DOWN;host-notify-by-email;Server answer:
[1347299794] HOST NOTIFICATION: david;webwork;DOWN;host-notify-by-email;Server answer:
[1347299794] HOST NOTIFICATION: bob;webwork;DOWN;host-notify-by-email;Server answer:
[1347299794] HOST NOTIFICATION: winston;webwork;DOWN;host-notify-by-email;Server answer:
[1347299794] HOST NOTIFICATION: larry;webwork;DOWN;host-notify-by-email;Server answer:
[1347328594] HOST NOTIFICATION: david;webwork;DOWN;host-notify-by-email;Server answer:
[1347328595] HOST NOTIFICATION: bob;webwork;DOWN;host-notify-by-email;Server answer:
[1347328595] HOST NOTIFICATION: winston;webwork;DOWN;host-notify-by-email;Server answer:
[1347328595] HOST NOTIFICATION: larry;webwork;DOWN;host-notify-by-email;Server answer:
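
The bracketed numbers in the nagios log are Unix epoch seconds, which are hard to line up with the apache timestamps. As a minimal sketch (the function name parse_nagios_line is made up here, and only a few of the lines above are repeated as sample data), something like this converts them to readable UTC times and tallies the entries by record type:

```python
import re
from collections import Counter
from datetime import datetime, timezone

# "[epoch] RECORD TYPE: field;field;..."
LINE_RE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")


def parse_nagios_line(line):
    """Split one nagios.log line into (UTC time, record type, fields)."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    epoch, record_type, payload = m.groups()
    when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return when, record_type, payload.split(";")


sample = [
    "[1347246000] CURRENT HOST STATE: webwork;DOWN;HARD;1;Server answer:",
    "[1347246000] CURRENT SERVICE STATE: webwork;Webwork Web Service;OK;HARD;1;"
    "HTTP OK: HTTP/1.1 200 OK - 4053 bytes in 0.274 second response time",
    "[1347270994] HOST NOTIFICATION: david;webwork;DOWN;host-notify-by-email;Server answer:",
    "[1347270994] HOST NOTIFICATION: bob;webwork;DOWN;host-notify-by-email;Server answer:",
]

records = [parse_nagios_line(line) for line in sample]
counts = Counter(record_type for _, record_type, _ in records)

for when, record_type, fields in records:
    print(when.isoformat(), record_type, fields[:3])
print(dict(counts))
```

Printed this way, it is easier to see which object (host or service) each entry refers to and when it was logged; the times are in UTC, so they need shifting by the local offset before comparing with the apache log.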
If I do the check_http manually, I seem to get through fine:
# /usr/lib/nagios3.2/libexec/check_http -S -I webwork -f follow -w 5 -c 20 -t 60 -u https://webwork.example.com/webwork
HTTP OK: HTTP/1.1 200 OK - 5162 bytes in 0.025 second response time |time=0.024700s;5.000000;20.000000;0.000000 size=5162B;;;0
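
The part of the plugin output after the "|" is performance data in the usual label=value;warn;crit;min;max layout. As a small sketch for reading it (parse_perfdata is a hypothetical helper, and it only handles plain numeric thresholds, not range expressions such as 10:20):

```python
import re

# Numeric value followed by an optional unit, e.g. "0.024700s" or "5162B".
VALUE_RE = re.compile(r"^(-?[0-9.]+)([A-Za-z%]*)$")


def _number(text):
    """Convert a threshold field to float; empty fields become None."""
    return float(text) if text else None


def parse_perfdata(text):
    """Parse 'label=value[unit];warn;crit;min;max' items separated by spaces."""
    metrics = {}
    for item in text.split():
        label, _, rest = item.partition("=")
        fields = rest.split(";")
        fields += [""] * (5 - len(fields))  # pad omitted trailing fields
        m = VALUE_RE.match(fields[0])
        if m is None:
            continue
        value, unit = m.groups()
        metrics[label] = {
            "value": float(value),
            "unit": unit,
            "warn": _number(fields[1]),
            "crit": _number(fields[2]),
            "min": _number(fields[3]),
            "max": _number(fields[4]),
        }
    return metrics


perf = parse_perfdata("time=0.024700s;5.000000;20.000000;0.000000 size=5162B;;;0")
print(perf["time"])
print(perf["size"])
```

For the manual run above, the time metric carries the -w 5 and -c 20 thresholds, and the measured 0.0247 seconds is well below both.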
Can anyone spot a reason why this alert is not set up properly, or
suggest a better way to do it?
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null