<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7226.0">
<TITLE>RE: [Nagios-users] Question about "Last Check" fields.</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<P><FONT SIZE=2>> I notice that even though I have all of my checks running every 5 minutes,<BR>
> the Last Check field in Nagios will sometimes be several days out of date.<BR>
> What can I do to force Nagios to be more accurate in that field? It has<BR>
> raised some doubt amongst management as to whether Nagios is really working or not.<BR>
<BR>
Last Check data is updated at the same time the status information is<BR>
(i.e. when checks are performed) so they should always be accurate. Do<BR>
you have any orphaned check processes? Are you allowing enough<BR>
concurrent checks to be run (max_concurrent_checks)? Are you reaping<BR>
service check results often (service_reaper_frequency)? I don't<BR>
understand why the information would be days old in either case. Perhaps<BR>
you _might_ have multiple daemons running? More information on your<BR>
installation, number of hosts and services and the type of hardware<BR>
might be helpful. Output of /path/to/nagios -s /path/to/nagios.cfg would<BR>
be informative as well.<BR>
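<BR>
One rough way to look into the multiple-daemons question is to list processes named<BR>
nagios together with their age. The sketch below is only an illustration: the function<BR>
name and the one-hour threshold are hypothetical, the sample listing is made up, and it<BR>
assumes a ps that supports `ps -eo pid,etimes,comm`. Short-lived nagios processes are<BR>
normal while checks run, so only long-lived ones are reported.<BR>
<PRE>
```python
def long_running_nagios(ps_output, min_age_seconds=3600):
    """Return PIDs of processes named 'nagios' that are at least
    min_age_seconds old.

    ps_output is text shaped like the output of `ps -eo pid,etimes,comm`
    (one header line, then one process per line).
    """
    pids = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # ignore lines that do not have all three columns
        pid, age, command = parts
        if command.strip() == "nagios" and int(age) >= min_age_seconds:
            pids.append(int(pid))
    return pids


# Made-up sample listing, for illustration only.
sample = """\
  PID ELAPSED COMMAND
  812  432000 nagios
 9120  260000 nagios
 9544       2 nagios
 9550      40 sshd
"""

print(long_running_nagios(sample))
```
</PRE>
More than one long-lived PID in the result would be a hint, not proof, that a second<BR>
daemon is running and writing its own status data.<BR>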
<BR>
<BR>
Marc,<BR>
<BR>
In response to your questions:<BR>
<BR>
<BR>
I'm not seeing any orphaned checks.<BR>
max_concurrent_checks = 0<BR>
service_reaper_frequency = 5<BR>
42 hosts<BR>
346 services<BR>
(though I'm about to triple the number of hosts and services with a new rollout<BR>
this weekend)<BR>
<BR>
It's running on a Dell 1750, a dual Xeon 2.4 GHz server with 2 GB of memory and three 73 GB drives<BR>
in hardware RAID 5.<BR>
<BR>
Output of nagios -s:<BR>
<BR>
-----------------------<BR>
<BR>
<BR>
Nagios 1.2<BR>
Copyright (c) 1999-2004 Ethan Galstad (nagios@nagios.org)<BR>
Last Modified: 02-02-2004<BR>
License: GPL<BR>
<BR>
SERVICE SCHEDULING INFORMATION<BR>
-------------------------------<BR>
Total services: 350<BR>
Total hosts: 44<BR>
<BR>
Command check interval: 10 sec<BR>
Check reaper interval: 5 sec<BR>
<BR>
Inter-check delay method: SMART<BR>
Average check interval: 120.857 sec<BR>
Inter-check delay: 0.345 sec<BR>
<BR>
Interleave factor method: SMART<BR>
Average services per host: 7.955<BR>
Service interleave factor: 8<BR>
<BR>
Initial service check scheduling info:<BR>
--------------------------------------<BR>
First scheduled check: 1110568976 -> Fri Mar 11 11:22:56 2005<BR>
Last scheduled check: 1110569097 -> Fri Mar 11 11:24:57 2005<BR>
<BR>
Rough guidelines for max_concurrent_checks value:<BR>
-------------------------------------------------<BR>
Absolute minimum value: 15<BR>
Recommend value: 45<BR>
<BR>
Notes:<BR>
The recommendations for the max_concurrent_checks value<BR>
assume that the average execution time for service<BR>
checks is less than the service check reaper interval.<BR>
The minimum value also reflects best case scenarios<BR>
where there are no problems on your network. You will<BR>
have to tweak this value as necessary after testing.<BR>
High latency values for checks are often indicative of<BR>
the max_concurrent_checks value being set too low and/or<BR>
the service_reaper_frequency being set too high.<BR>
It is important to note that the values displayed above<BR>
do not reflect current performance information for any<BR>
Nagios process that may currently be running. They are<BR>
provided solely to project expected and recommended<BR>
values based on the current data in the config files.<BR>
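<BR>
The figures in the output above appear to fit some simple arithmetic. The sketch below is<BR>
inferred from the numbers shown, not taken from the Nagios source: the function and key<BR>
names are hypothetical, and the factor of 3 between the minimum and the recommended value<BR>
is an assumption that happens to match 15 and 45.<BR>
<PRE>
```python
import math


def smart_schedule(total_services, total_hosts, avg_check_interval, reaper_interval):
    """Reproduce the scheduling figures from the totals and intervals (seconds)."""
    # Spread all services evenly across one average check interval.
    inter_check_delay = avg_check_interval / total_services
    # Round the average number of services per host up to a whole number.
    interleave_factor = math.ceil(total_services / total_hosts)
    # Number of checks that can be started during one reaper interval.
    min_concurrent = math.ceil(reaper_interval / inter_check_delay)
    return {
        "inter_check_delay": round(inter_check_delay, 3),
        "interleave_factor": interleave_factor,
        "min_concurrent_checks": min_concurrent,
        # Assumption: the recommended value is three times the minimum.
        "recommended_concurrent_checks": 3 * min_concurrent,
    }


# Values copied from the output above.
result = smart_schedule(350, 44, 120.857, 5)
print(result)
```
</PRE>
With max_concurrent_checks = 0 and a 5 second reaper interval, none of these projected<BR>
values by themselves would explain a Last Check that is days old.<BR>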
</FONT>
</P>
</BODY>
</HTML>