massive service check latencies
Ben
bench at silentmedia.com
Mon Mar 21 23:00:09 CET 2005
I've been having a horrible time with service check latencies. I've got
~6k services so I thought at first maybe my hardware couldn't keep up.
But after moving to much beefier hardware, things have actually gotten
worse, not better. Since I'd been running a recent beta, I figured maybe
one of the newer check-ins had fixed something. I pulled down the latest
from CVS this morning, and it shows the same problem.
So now I think I have a basic misunderstanding of the way Nagios
schedules checks. Here's how I've tweaked my settings to try to make
checks run more frequently:
service_inter_check_delay_method=n
max_service_check_spread=60
service_interleave_factor=s
host_inter_check_delay_method=n
max_host_check_spread=60
max_concurrent_checks=0
service_reaper_frequency=5
What I notice is that checks are queued up several dozen at a time, and
that they all have to finish before the next batch can begin. As far as I
can tell, there is no way to make the size of the batch grow, or to stop
waiting for all checks to finish before moving on. The hardware (dual 2.8 GHz
Xeon with 2.5 GB of RAM dedicated to monitoring) is not at all stressed.
Interestingly, while my service check latencies average around 500
seconds, my host check latencies are well under 1 second, which is what I
would expect. FWIW, I've got about 2300 hosts.
Oh, and the average execution time for both service and host checks is
about 3 seconds.
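To put rough numbers on this, here is a back-of-the-envelope sketch in Python. It is not how Nagios schedules anything internally; it only applies Little's law (average checks in flight = check rate x execution time) to the figures above. The 5-minute check interval and the batch size of 40 are assumptions chosen for illustration, since neither is stated in this message, and the function names are hypothetical.

```python
import math


def required_concurrency(num_checks, interval_s, exec_time_s):
    """Little's law: average number of checks that must be in flight
    to run every check once per interval."""
    checks_per_second = num_checks / interval_s
    return checks_per_second * exec_time_s


def batch_cycle_time(num_checks, batch_size, batch_time_s):
    """Time to run every check once when batches execute strictly one
    after another and each batch takes batch_time_s to finish."""
    batches = math.ceil(num_checks / batch_size)
    return batches * batch_time_s


services = 6000      # "~6k services" from the message
exec_time = 3.0      # average execution time in seconds, from the message
interval = 300.0     # ASSUMPTION: 5-minute check interval (not stated)
batch_size = 40      # ASSUMPTION: "several dozen" taken as 40

print(required_concurrency(services, interval, exec_time))
print(batch_cycle_time(services, batch_size, exec_time))
```

Under these assumptions, roughly 60 checks would need to be running at any moment to keep up, while strictly serial batches of 40 would need 450 seconds for one full pass, which is longer than the assumed interval. If the batching I'm seeing is real, that alone could make latency pile up regardless of how fast the hardware is.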