<font size=2 face="sans-serif">Hi,</font>
<br>
<br><font size=2 face="sans-serif">I have a Nagios 3.2.3 deployment with
about 930 hosts and 4000 services. Nagios runs together with NDO and
PNP (in bulk mode) on a server with 4 GB of RAM and 4 CPUs.</font>
<br>
<br><font size=2 face="sans-serif">One day I noticed that the check latency
reported by the performance CGI was very high (300-400 seconds). That seemed strange,
so I took the Nagios tuning guide (</font><a href=http://nagios.sourceforge.net/docs/3_0/tuning.html><font size=2 face="sans-serif">http://nagios.sourceforge.net/docs/3_0/tuning.html</font></a><font size=2 face="sans-serif">)
and applied every point I could. In particular, I set max_concurrent_checks
to zero (no limit):</font>
<br>
<br><font size=2 face="sans-serif">max_concurrent_checks=0</font>
<br>
<br><font size=2 face="sans-serif">I also adjusted the check result reaper settings:</font>
<br>
<br><font size=2 face="sans-serif">service_reaper_frequency=5</font>
<br><font size=2 face="sans-serif">max_check_result_reaper_time=15</font>
<br>
<br><font size=2 face="sans-serif">I also checked that host checks were
not forced. In addition, I configured a 15-second host check cache:</font>
<br>
<br><font size=2 face="sans-serif">cached_host_check_horizon=15</font>
<br>
<br><font size=2 face="sans-serif">But the problem remains, and the server
load is not very high: a load average of 2.5, 2 GB of free memory, and an
average disk utilization of 7%. I disabled NDO and PNP, but it made no difference.
After the first round of checks the latency returns, while the server
load does not grow.</font>
<br>
<br><font size=2 face="sans-serif">I have searched Google, but all the
reported problems are caused by high server load, which is not the
main problem here. So my question is: what can I do now? Is there some variable
that shows me where to look? I'm a bit lost right now and I don't know
how to find the problem.</font>
<br>
<br><font size=2 face="sans-serif">Or maybe the only way is to configure
a master-slave Nagios setup in order to make better use of the server?</font>
<br>
<br><font size=2 face="sans-serif">In addition, I have fairly long timeouts
(60 seconds) because of high latency on the network. Any help
is appreciated. Thank you in advance.</font>
<br>
<br><font size=3 face="sans-serif"><b>nagiostats</b></font>
<br><font size=2 face="sans-serif">Nagios Stats 3.2.3</font>
<br><font size=2 face="sans-serif">Copyright (c) 2003-2008 Ethan Galstad
(</font><a href=www.nagios.org><font size=2 face="sans-serif">www.nagios.org</font></a><font size=2 face="sans-serif">)</font>
<br><font size=2 face="sans-serif">Last Modified: 10-03-2010</font>
<br><font size=2 face="sans-serif">License: GPL</font>
<br>
<br><font size=2 face="sans-serif">CURRENT STATUS DATA</font>
<br><font size=2 face="sans-serif">------------------------------------------------------</font>
<br><font size=2 face="sans-serif">Status File:
/usr/local/argos/aplicaciones/nagios/var/status.dat</font>
<br><font size=2 face="sans-serif">Status File Age:
0d 0h 0m
11s</font>
<br><font size=2 face="sans-serif">Status File Version:
3.2.3</font>
<br>
<br><font size=2 face="sans-serif">Program Running Time:
0d 20h 56m 7s</font>
<br><font size=2 face="sans-serif">Nagios PID:
21834</font>
<br><font size=2 face="sans-serif">Used/High/Total Command Buffers:
0 / 0 / 4096</font>
<br>
<br><font size=2 face="sans-serif">Total Services:
4032</font>
<br><font size=2 face="sans-serif">Services Checked:
4032</font>
<br><font size=2 face="sans-serif">Services Scheduled:
4030</font>
<br><font size=2 face="sans-serif">Services Actively Checked:
4032</font>
<br><font size=2 face="sans-serif">Services Passively Checked:
0</font>
<br><font size=2 face="sans-serif">Total Service State Change:
0.000 / 37.300 / 0.163 %</font>
<br><font size=2 face="sans-serif">Active Service Latency:
32.876 / 442.138 / 415.816 sec</font>
<br><font size=2 face="sans-serif">Active Service Execution Time:
0.051 / 60.097 / 1.545 sec</font>
<br><font size=2 face="sans-serif">Active Service State Change:
0.000 / 37.300 / 0.163 %</font>
<br><font size=2 face="sans-serif">Active Services Last 1/5/15/60 min:
237 / 1530 / 4020 / 4020</font>
<br><font size=2 face="sans-serif">Passive Service Latency:
0.000 / 0.000 / 0.000 sec</font>
<br><font size=2 face="sans-serif">Passive Service State Change:
0.000 / 0.000 / 0.000 %</font>
<br><font size=2 face="sans-serif">Passive Services Last 1/5/15/60 min:
0 / 0 / 0 / 0</font>
<br><font size=2 face="sans-serif">Services Ok/Warn/Unk/Crit:
3766 / 38 / 44 / 184</font>
<br><font size=2 face="sans-serif">Services Flapping:
0</font>
<br><font size=2 face="sans-serif">Services In Downtime:
0</font>
<br>
<br><font size=2 face="sans-serif">Total Hosts:
931</font>
<br><font size=2 face="sans-serif">Hosts Checked:
931</font>
<br><font size=2 face="sans-serif">Hosts Scheduled:
931</font>
<br><font size=2 face="sans-serif">Hosts Actively Checked:
931</font>
<br><font size=2 face="sans-serif">Host Passively Checked:
0</font>
<br><font size=2 face="sans-serif">Total Host State Change:
0.000 / 12.370 / 0.077 %</font>
<br><font size=2 face="sans-serif">Active Host Latency:
0.000 / 441.308 / 416.063
sec</font>
<br><font size=2 face="sans-serif">Active Host Execution Time:
0.062 / 10.113 / 0.395 sec</font>
<br><font size=2 face="sans-serif">Active Host State Change:
0.000 / 12.370 / 0.077 %</font>
<br><font size=2 face="sans-serif">Active Hosts Last 1/5/15/60 min:
74 / 423 / 931 / 931</font>
<br><font size=2 face="sans-serif">Passive Host Latency:
0.000 / 0.000 / 0.000
sec</font>
<br><font size=2 face="sans-serif">Passive Host State Change:
0.000 / 0.000 / 0.000 %</font>
<br><font size=2 face="sans-serif">Passive Hosts Last 1/5/15/60 min:
0 / 0 / 0 / 0</font>
<br><font size=2 face="sans-serif">Hosts Up/Down/Unreach:
897 / 24 / 10</font>
<br><font size=2 face="sans-serif">Hosts Flapping:
0</font>
<br><font size=2 face="sans-serif">Hosts In Downtime:
1</font>
<br>
<br><font size=2 face="sans-serif">Active Host Checks Last 1/5/15 min:
109 / 535 / 1583</font>
<br><font size=2 face="sans-serif"> Scheduled:
87 / 433 / 1300</font>
<br><font size=2 face="sans-serif"> On-demand:
22 / 102 / 283</font>
<br><font size=2 face="sans-serif"> Parallel:
87 / 438 / 1323</font>
<br><font size=2 face="sans-serif"> Serial:
0 / 0 / 0</font>
<br><font size=2 face="sans-serif"> Cached:
22 / 97 / 260</font>
<br><font size=2 face="sans-serif">Passive Host Checks Last 1/5/15 min:
0 / 0 / 0</font>
<br><font size=2 face="sans-serif">Active Service Checks Last 1/5/15 min:
304 / 1605 / 4924</font>
<br><font size=2 face="sans-serif"> Scheduled:
304 / 1605 / 4923</font>
<br><font size=2 face="sans-serif"> On-demand:
0 / 0 / 1</font>
<br><font size=2 face="sans-serif"> Cached:
0 / 0 / 0</font>
<br><font size=2 face="sans-serif">Passive Service Checks Last 1/5/15 min:
0 / 0 / 0</font>
<br>
<br><font size=2 face="sans-serif">External Commands Last 1/5/15 min:
0 / 0 / 0</font>
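<br>
<br><font size=2 face="sans-serif">As a rough sanity check on the figures above
(a sketch, not something nagiostats reports), Little's law gives the average
number of checks in flight as completion rate times average execution time.
The function name checks_in_flight is only illustrative, and I am assuming
that cached host checks do not spawn a process, so I use the "Parallel" host
count rather than the total.</font>
<pre>
```python
# Little's law estimate: checks in flight = completion rate * average run time.
# All figures are copied from the nagiostats output in this post.

def checks_in_flight(checks_in_window, window_min, avg_exec_s):
    """Average number of checks running at once over a nagiostats window."""
    rate_per_s = checks_in_window / (window_min * 60.0)
    return rate_per_s * avg_exec_s

# Active service checks, last 15 min: 4924; average execution time: 1.545 s
svc_in_flight = checks_in_flight(4924, 15, 1.545)

# Parallel host checks, last 15 min: 1323; average execution time: 0.395 s
# (assumption: cached host checks are excluded because they run no plugin)
host_in_flight = checks_in_flight(1323, 15, 0.395)

print(f"service checks in flight: {svc_in_flight:.2f}")
print(f"host checks in flight:    {host_in_flight:.2f}")
print(f"total:                    {svc_in_flight + host_in_flight:.2f}")
```
</pre>
<font size=2 face="sans-serif">If this estimate is right, only about nine
checks are running at any moment on average, which would be consistent with
the low server load and would suggest that checks are being started too
slowly rather than that the machine is saturated.</font>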
<br>
<br><font size=3 face="sans-serif"><b>nagios -s</b></font>
<br>
<br><font size=2 face="sans-serif">Nagios Core 3.2.3</font>
<br><font size=2 face="sans-serif">Copyright (c) 2009-2010 Nagios Core
Development Team and Community Contributors</font>
<br><font size=2 face="sans-serif">Copyright (c) 1999-2009 Ethan Galstad</font>
<br><font size=2 face="sans-serif">Last Modified: 10-03-2010</font>
<br><font size=2 face="sans-serif">License: GPL</font>
<br>
<br><font size=2 face="sans-serif">Website: </font><a href=http://www.nagios.org/><font size=2 face="sans-serif">http://www.nagios.org</font></a>
<br><font size=2 face="sans-serif">Warning: aggregate_status_updates directive
ignored. All status file updates are now aggregated.</font>
<br><font size=2 face="sans-serif">Warning: downtime_file variable ignored.
Downtime entries are now stored in the status and retention files.</font>
<br><font size=2 face="sans-serif">Warning: comment_file variable ignored.
Comments are now stored in the status and retention files.</font>
<br><font size=2 face="sans-serif">Timing information on object configuration
processing is listed</font>
<br><font size=2 face="sans-serif">below. You can use this information
to see if precaching your</font>
<br><font size=2 face="sans-serif">object configuration would be useful.</font>
<br>
<br><font size=2 face="sans-serif">Object Config Source: Config files (uncached)</font>
<br>
<br><font size=2 face="sans-serif">OBJECT CONFIG PROCESSING TIMES
(* = Potential for precache savings with -u option)</font>
<br><font size=2 face="sans-serif">----------------------------------</font>
<br><font size=2 face="sans-serif">Read:
0.080036 sec</font>
<br><font size=2 face="sans-serif">Resolve:
0.010660 sec *</font>
<br><font size=2 face="sans-serif">Recomb Contactgroups: 0.002666 sec *</font>
<br><font size=2 face="sans-serif">Recomb Hostgroups: 0.004086
sec *</font>
<br><font size=2 face="sans-serif">Dup Services:
0.034632 sec *</font>
<br><font size=2 face="sans-serif">Recomb Servicegroups: 0.001277 sec *</font>
<br><font size=2 face="sans-serif">Duplicate:
0.010939 sec *</font>
<br><font size=2 face="sans-serif">Inherit:
0.005594 sec *</font>
<br><font size=2 face="sans-serif">Recomb Contacts: 0.000001
sec *</font>
<br><font size=2 face="sans-serif">Sort:
0.000000 sec *</font>
<br><font size=2 face="sans-serif">Register:
0.074413 sec</font>
<br><font size=2 face="sans-serif">Free:
0.008730 sec</font>
<br><font size=2 face="sans-serif">
============</font>
<br><font size=2 face="sans-serif">TOTAL:
0.234920 sec * = 0.071741 sec (30.54%) estimated
savings</font>
<br>
<br>
<br><font size=2 face="sans-serif">RETENTION DATA TIMES</font>
<br><font size=2 face="sans-serif">----------------------------------</font>
<br><font size=2 face="sans-serif">Read and Process: 0.495480
sec</font>
<br><font size=2 face="sans-serif">
============</font>
<br><font size=2 face="sans-serif">TOTAL:
0.495480 sec</font>
<br>
<br>
<br><font size=2 face="sans-serif">Timing information on configuration
verification is listed below.</font>
<br>
<br><font size=2 face="sans-serif">CONFIG VERIFICATION TIMES
(* = Potential for speedup with -x option)</font>
<br><font size=2 face="sans-serif">----------------------------------</font>
<br><font size=2 face="sans-serif">Object Relationships: 0.060039 sec</font>
<br><font size=2 face="sans-serif">Circular Paths:
0.026557 sec *</font>
<br><font size=2 face="sans-serif">Misc:
0.005999 sec</font>
<br><font size=2 face="sans-serif">
============</font>
<br><font size=2 face="sans-serif">TOTAL:
0.092595 sec * = 0.026557 sec (28.7%) estimated
savings</font>
<br>
<br>
<br><font size=2 face="sans-serif">EVENT SCHEDULING TIMES</font>
<br><font size=2 face="sans-serif">-------------------------------------</font>
<br><font size=2 face="sans-serif">Get service info:
0.014509 sec</font>
<br><font size=2 face="sans-serif">Get host info info: 0.002853
sec</font>
<br><font size=2 face="sans-serif">Get service params: 0.000078
sec</font>
<br><font size=2 face="sans-serif">Schedule service times: 0.039947
sec</font>
<br><font size=2 face="sans-serif">Schedule service events: 0.034656 sec</font>
<br><font size=2 face="sans-serif">Get host params:
0.000001 sec</font>
<br><font size=2 face="sans-serif">Schedule host times: 0.007519
sec</font>
<br><font size=2 face="sans-serif">Schedule host events: 0.029519
sec</font>
<br><font size=2 face="sans-serif">
============</font>
<br><font size=2 face="sans-serif">TOTAL:
0.129082 sec</font>
<br>
<br>
<br><font size=2 face="sans-serif">Projected scheduling information for
host and service checks</font>
<br><font size=2 face="sans-serif">is listed below. This information
assumes that you are going</font>
<br><font size=2 face="sans-serif">to start running Nagios with your current
config files.</font>
<br>
<br><font size=2 face="sans-serif">HOST SCHEDULING INFORMATION</font>
<br><font size=2 face="sans-serif">---------------------------</font>
<br><font size=2 face="sans-serif">Total hosts:
931</font>
<br><font size=2 face="sans-serif">Total scheduled hosts:
931</font>
<br><font size=2 face="sans-serif">Host inter-check delay method:
SMART</font>
<br><font size=2 face="sans-serif">Average host check interval:
259.01 sec</font>
<br><font size=2 face="sans-serif">Host inter-check delay:
0.28 sec</font>
<br><font size=2 face="sans-serif">Max host check spread:
30 min</font>
<br><font size=2 face="sans-serif">First scheduled check:
Tue Oct 11 13:14:08 2011</font>
<br><font size=2 face="sans-serif">Last scheduled check:
Tue Oct 11 13:18:26 2011</font>
<br>
<br>
<br><font size=2 face="sans-serif">SERVICE SCHEDULING INFORMATION</font>
<br><font size=2 face="sans-serif">-------------------------------</font>
<br><font size=2 face="sans-serif">Total services:
4032</font>
<br><font size=2 face="sans-serif">Total scheduled services:
4030</font>
<br><font size=2 face="sans-serif">Service inter-check delay method:
SMART</font>
<br><font size=2 face="sans-serif">Average service check interval:
299.55 sec</font>
<br><font size=2 face="sans-serif">Inter-check delay:
0.07 sec</font>
<br><font size=2 face="sans-serif">Interleave factor method:
SMART</font>
<br><font size=2 face="sans-serif">Average services per host:
4.33</font>
<br><font size=2 face="sans-serif">Service interleave factor:
5</font>
<br><font size=2 face="sans-serif">Max service check spread:
30 min</font>
<br><font size=2 face="sans-serif">First scheduled check:
Tue Oct 11 13:15:07 2011</font>
<br><font size=2 face="sans-serif">Last scheduled check:
Tue Oct 11 13:20:07 2011</font>
<br>
<br>
<br><font size=2 face="sans-serif">CHECK PROCESSING INFORMATION</font>
<br><font size=2 face="sans-serif">----------------------------</font>
<br><font size=2 face="sans-serif">Check result reaper interval:
5 sec</font>
<br><font size=2 face="sans-serif">Max concurrent service checks:
Unlimited</font>
<br>
<br>
<br><font size=2 face="sans-serif">PERFORMANCE SUGGESTIONS</font>
<br><font size=2 face="sans-serif">-----------------------</font>
<br><font size=2 face="sans-serif">I have no suggestions - things look
okay.</font>
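<br>
<br><font size=2 face="sans-serif">To put the latency in context, here is a
back-of-the-envelope comparison of the service check rate the configuration
asks for against the rate actually observed, using only figures from the
nagiostats and nagios -s output in this post. The function names are
illustrative, and effective_interval rests on my assumption that each check
is rescheduled one interval after it actually runs, so the latency stretches
the real period.</font>
<pre>
```python
# Rough throughput check; all inputs are copied from the output in this post.

def required_rate(scheduled_checks, avg_interval_s):
    """Checks per second needed to run every check once per interval."""
    return scheduled_checks / avg_interval_s

def observed_rate(checks_in_window, window_min):
    """Checks per second actually executed over a nagiostats window."""
    return checks_in_window / (window_min * 60.0)

def effective_interval(avg_interval_s, avg_latency_s):
    """Assumed model: the real period is the interval plus the latency."""
    return avg_interval_s + avg_latency_s

# 4030 scheduled services, average check interval 299.55 s (nagios -s)
svc_needed = required_rate(4030, 299.55)

# 4924 active service checks in the last 15 minutes (nagiostats)
svc_seen = observed_rate(4924, 15)

# Average active service latency 415.816 s (nagiostats)
svc_period = effective_interval(299.55, 415.816)
svc_predicted = 4030 / svc_period

print(f"needed:    {svc_needed:.2f} checks/s")
print(f"observed:  {svc_seen:.2f} checks/s")
print(f"predicted from interval + latency: {svc_predicted:.2f} checks/s")
```
</pre>
<font size=2 face="sans-serif">By this estimate the configuration needs about
13.5 service checks per second, while only about 5.5 per second are being
executed. That observed rate is close to what an effective period of interval
plus latency would give, so the latency figure and the check counts appear to
tell the same story.</font>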
<br><tt><font size=3>-- <br>
Javier Vela Diago<br>
S2 GRUPO<br>
Ramiro de Maeztu, 7 bajo. 46022 Valencia<br>
Tel: 963.110.300 Fax: 963.106.086<br>
e-mail : jvela arroba s2grupo punto es<br>
</font></tt><a href=http://www.s2grupo.es/><tt><font size=3 color=blue><u>http://www.s2grupo.es</u></font></tt></a>