Gaps in availability report (ie the end of a down != the start of the next up)
Stanley Hopcroft
Stanley.Hopcroft at IPAustralia.Gov.AU
Wed Feb 26 04:45:38 CET 2003
Dear Ladies and Gentlemen,
I am writing to report a perplexing observation in the avail.cgi report
from Nagios-1.0, for both a host and a service on the host.
The problem is that the report presents alternating down and up
intervals with the start and end times of each interval. To my surprise,
however, the start of the up interval after a down is not the same as
the end of the down.
Formally,

        Start   End
Down    d1      u1
Up      u2      d2    ... with u1 != u2.
For example,
tsitc> lynx 'http://<nagios>/cgi-bin/avail.cgi?t1=1046143534&t2=1046229934&show_log_entries=&host=hpa-bne&service=ISDN+access+to+HPA&assumeinitialstates=yes&assumestateretention=yes&initialassumedstate=6&backtrack=0&timeperiod=thismonth'
Event Start Time     Event End Time       Event Duration   Event/State Type  Event/State Information
02-01-2003 00:00:00  02-03-2003 15:18:57  2d 15h 18m 57s   SERVICE OK        First Service State Assumed (Faked Log Entry)
02-20-2003 20:42:08  02-21-2003 10:49:46  0d 14h 7m 38s    SERVICE CRITICAL  CRITICAL - Plugin timed out after 15 seconds
02-21-2003 10:52:56  02-21-2003 12:06:15  0d 1h 13m 19s    SERVICE OK        PING ok - Packet loss = 0%, RTA = 76.12 ms
02-21-2003 19:18:06  02-22-2003 18:02:13  0d 22h 44m 7s    SERVICE CRITICAL  CRITICAL - Plugin timed out after 15 seconds
02-24-2003 12:29:39  02-24-2003 15:09:47  0d 2h 40m 8s     SERVICE OK        PING ok - Packet loss = 0%, RTA = 436.29 ms
02-25-2003 12:46:41  02-25-2003 13:02:33  0d 0h 15m 52s    SERVICE CRITICAL  CRITICAL - Plugin timed out after 15 seconds
02-25-2003 17:14:54  02-26-2003 14:30:04  0d 21h 15m 10s+  SERVICE OK        PING ok - Packet loss = 20%, RTA = 110.98 ms
(This is the service availability report but the host availability is
the same)
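To make the mismatch concrete, here is a minimal Python sketch that compares the end of each interval in the report above with the start of the next one. The rows are copied from the report; the names FMT, ROWS and find_gaps are hypothetical and not part of Nagios, and the date format is assumed to be month-day-year.

```python
from datetime import datetime

# Assumed date format of the avail.cgi report (month-day-year).
FMT = "%m-%d-%Y %H:%M:%S"

# (start, end, state) rows copied from the report above.
ROWS = [
    ("02-01-2003 00:00:00", "02-03-2003 15:18:57", "OK"),
    ("02-20-2003 20:42:08", "02-21-2003 10:49:46", "CRITICAL"),
    ("02-21-2003 10:52:56", "02-21-2003 12:06:15", "OK"),
    ("02-21-2003 19:18:06", "02-22-2003 18:02:13", "CRITICAL"),
    ("02-24-2003 12:29:39", "02-24-2003 15:09:47", "OK"),
    ("02-25-2003 12:46:41", "02-25-2003 13:02:33", "CRITICAL"),
    ("02-25-2003 17:14:54", "02-26-2003 14:30:04", "OK"),
]


def find_gaps(rows):
    """Return (previous state, next state, gap) for every pair of
    consecutive intervals whose end and start times differ."""
    parsed = [
        (datetime.strptime(start, FMT), datetime.strptime(end, FMT), state)
        for start, end, state in rows
    ]
    gaps = []
    for (_, prev_end, prev_state), (next_start, _, next_state) in zip(parsed, parsed[1:]):
        if next_start != prev_end:
            gaps.append((prev_state, next_state, next_start - prev_end))
    return gaps


for prev_state, next_state, gap in find_gaps(ROWS):
    print(f"{prev_state} -> {next_state}: unaccounted gap of {gap}")
```

Run against these rows, every consecutive pair of intervals is reported as having a gap; the last one, between the CRITICAL interval ending at 13:02:33 and the OK interval starting at 17:14:54 on 02-25, is 4h 12m 21s.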
Nag was running all the time the service was being monitored, and the
service (check_ping) does not return UNKNOWNs.
Here is the log extract for the intervals above,
tsitc> tail -2000 nagios.log | grep hpa-bne | grep HARD | ./ns_log_localtime
Tue Feb 25 12:46:39 HOST ALERT: hpa-bne;DOWN;HARD;10;CRITICAL - Plugin timed out after 10 seconds
Tue Feb 25 12:46:41 SERVICE ALERT: hpa-bne;ISDN access to HPA;CRITICAL;HARD;1;CRITICAL - Plugin timed out after 15 seconds
Tue Feb 25 17:14:53 HOST ALERT: hpa-bne;UP;HARD;1;PING ok - Packet loss = 0%, RTA = 75.09 ms
Tue Feb 25 17:14:54 SERVICE ALERT: hpa-bne;ISDN access to HPA;OK;HARD;1;PING ok - Packet loss = 20%, RTA = 110.98 ms
tsitc>
tsitc> tail -2000 nagios.log | grep PROGRAM | ./ns_log_localtime
tsitc> tail -2000 nagios.log | grep UNKNOWN | grep hpa-bne | ./ns_log_localtime
tsitc>
I expected the SERVICE CRITICAL end time in the report to be
17:14 (the start of the SERVICE OK interval) instead of 13:02.
The check_interval is 10 minutes; the log rotation period is monthly.
The Nagios log can be posted if necessary.
I haven't done any research apart from searching the Nag FAQs for
'availability'.
This isn't too much of a problem, as I usually determine down and up times
from the log. It is surprising and unsettling to see that it can't be
done from the availability CGI, and it suggests that the availability
computation doesn't allow for the gap intervals.
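For comparison, this is roughly how the down time can be derived from the log instead. It is only a sketch in Python: it parses the two HARD SERVICE ALERT lines from the log extract above (already converted to local time), the year 2003 is assumed because the extract does not show it, and parse_alert and intervals_from_alerts are hypothetical helper names, not Nagios functions.

```python
from datetime import datetime

# HARD service alerts copied from the log extract above, with wrapped
# lines joined. Timestamps are local time; the year is an assumption.
LOG_LINES = [
    "Tue Feb 25 12:46:41 SERVICE ALERT: hpa-bne;ISDN access to HPA;"
    "CRITICAL;HARD;1;CRITICAL - Plugin timed out after 15 seconds",
    "Tue Feb 25 17:14:54 SERVICE ALERT: hpa-bne;ISDN access to HPA;"
    "OK;HARD;1;PING ok - Packet loss = 20%, RTA = 110.98 ms",
]


def parse_alert(line, year=2003):
    """Split one converted SERVICE ALERT line into (time, state, state type)."""
    stamp, _, rest = line.partition(" SERVICE ALERT: ")
    when = datetime.strptime(f"{year} {stamp}", "%Y %a %b %d %H:%M:%S")
    # Fields: host;service;state;state type;attempt;plugin output
    _host, _service, state, state_type, _attempt, _output = rest.split(";", 5)
    return when, state, state_type


def intervals_from_alerts(lines):
    """Each HARD state lasts until the next HARD state change."""
    hard = [(when, state) for when, state, kind in map(parse_alert, lines)
            if kind == "HARD"]
    return [(state, start, end)
            for (start, state), (end, _) in zip(hard, hard[1:])]


for state, start, end in intervals_from_alerts(LOG_LINES):
    print(f"{state}: {start} -> {end} ({end - start})")
```

On these two lines the CRITICAL interval runs from 12:46:41 to 17:14:54, i.e. 4h 28m 13s, against the 0h 15m 52s shown in the availability report.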
Please let me know if this is a STFW.
Yours sincerely.
--
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------
'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'
from Meditation 17, J Donne.
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null