Nagios performance and delayed checks
kyle.odonnell at gmail.com
Thu Sep 6 15:06:15 CEST 2007
Hi Kyle,
I used the Nagios tuning guide:
http://nagios.sourceforge.net/docs/2_0/tuning.html
I've also had some success with lowering 'service_reaper_frequency',
and I've found lowering the host and service timeouts useful.
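
For what it's worth, the figures in the 'nagios -s' output quoted below can be sanity-checked with a few lines of Python. This is only a rough sketch: it assumes the SMART inter-check delay is simply the average check interval divided by the number of services (which matches the 0.15 sec reported), and it uses Little's law to estimate how many checks would be running at once. The function names are my own, not anything from Nagios.

```python
# Figures copied from the quoted `nagios -s` and nagiostat output.
TOTAL_SERVICES = 2929            # "Total scheduled services"
AVG_CHECK_INTERVAL_SEC = 433.36  # "Average service check interval"
AVG_EXECUTION_TIME_SEC = 0.483   # average "Active Service Execution Time"


def inter_check_delay(avg_interval_sec: float, total_services: int) -> float:
    """Gap between consecutive checks if they are spread evenly
    over one average check interval (assumed SMART behaviour)."""
    return avg_interval_sec / total_services


def required_rate(avg_interval_sec: float, total_services: int) -> float:
    """Checks per second needed to run every service once per interval."""
    return total_services / avg_interval_sec


def checks_in_flight(rate_per_sec: float, exec_time_sec: float) -> float:
    """Little's law: average concurrency = arrival rate * time in system."""
    return rate_per_sec * exec_time_sec


delay = inter_check_delay(AVG_CHECK_INTERVAL_SEC, TOTAL_SERVICES)
rate = required_rate(AVG_CHECK_INTERVAL_SEC, TOTAL_SERVICES)
in_flight = checks_in_flight(rate, AVG_EXECUTION_TIME_SEC)

print(f"inter-check delay:      {delay:.2f} sec")
print(f"required check rate:    {rate:.2f} checks/sec")
print(f"avg checks in flight:   {in_flight:.2f}")
```

On those figures the scheduler needs roughly 7 checks per second and only about 3 checks in flight at a time. That would fit with the load average staying below 1, and it may suggest the latency comes from how results are scheduled and reaped rather than from the hardware.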
--kyleo
On 9/6/07, kyle <kyle at caosdigital.com> wrote:
>
> Hi folks,
>
> I'm having some performance issues with a rather big Nagios 2.4 deployment
> (430 servers, 3000 service checks).
>
> I'm now seeing an average check latency of 600-900 seconds. The check
> distribution is more or less like this:
>
> 450 check_icmp (scheduled every minute)
> 500 check_ifoperstatus (scheduled every 5 minutes)
> 800 check_nt (scheduled every 5-10 minutes)
> 200 check_http (scheduled every 5 minutes)
> 1000 check_by_ssh (scheduled every 5-10 minutes)
>
> 1730 Performance graphs generated by PNP
>
> I've already applied some of the performance tips from the Nagios FAQ
> (aggregated status updates, max_concurrent_checks=0, checked hardware
> config, etc.)
>
> Since the load average for this server is always below 1, is there any way
> to force more concurrent checks per second? (By the way, I've replicated
> the same config on a similar server with a Nagios 2.9 setup and ended up
> with similar results.)
>
> Thanks in advance :-)
>
>
> ---
>
> Output of nagiostats:
>
> Active Service Latency: 590.886 / 671.695 / 635.054 sec
> Active Service Execution Time: 0.124 / 19.130 / 0.483 sec
> Active Service State Change: 0.000 / 49.930 / 0.238 %
> Active Services Last 1/5/15/60 min: 61 / 673 / 2150 / 2929
>
>
>
> Output of nagios -s nagios.cfg :
>
> HOST SCHEDULING INFORMATION
> ---------------------------
> Total hosts: 424
> Total scheduled hosts: 0
> Host inter-check delay method: SMART
> Average host check interval: 0.00 sec
> Host inter-check delay: 0.00 sec
> Max host check spread: 30 min
> First scheduled check: N/A
> Last scheduled check: N/A
>
>
> SERVICE SCHEDULING INFORMATION
> -------------------------------
> Total services: 2929
> Total scheduled services: 2929
> Service inter-check delay method: SMART
> Average service check interval: 433.36 sec
> Inter-check delay: 0.15 sec
> Interleave factor method: SMART
> Average services per host: 6.91
> Service interleave factor: 7
> Max service check spread: 30 min
> First scheduled check: Thu Sep 6 11:18:31 2007
> Last scheduled check: Thu Sep 6 11:25:45 2007
>
>
>
>
>
> --
> Windows macht frei!
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems? Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting
> any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>