Nagios and Gearman - huge environment performance problem
Rodney Ramos
rodneyra at gmail.com
Wed Aug 24 16:26:07 CEST 2011
Hi Sven. Thank you again. I'm pretty sure that my check interval is 15 min
for both hosts and services; I've set this in the templates.cfg file (see
below). I am also sending the nagiostats output. I agree with you that
100k checks / 15 min ≈ 111 checks/sec, but Nagios does not spread these
checks evenly over the interval. That is the problem.
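
The arithmetic above can be checked with a short sketch. It assumes interval_length is left at its default of 60 seconds, so that check_interval 15 means 900 seconds; the function name is only illustrative.

```python
def required_rate(total_checks: int, check_interval: int,
                  interval_length: int = 60) -> float:
    """Average checks per second needed to run every check once per interval.

    interval_length=60 is the Nagios default and an assumption here; a
    different value in nagios.cfg changes the result proportionally.
    """
    period_seconds = check_interval * interval_length
    return total_checks / period_seconds


# Round figure from the message: 100k checks every 15 minutes.
print(round(required_rate(100_000, 15), 1))

# Exact totals from the nagiostats output below (services + hosts).
print(round(required_rate(68206 + 34103, 15), 1))
```

This is only the average rate. It says nothing about how evenly the scheduler spreads the checks, which is the actual complaint.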
==========
templates.cfg
==========
define host{
        name                    generic-host
        ...
        check_interval          15
        ...
}

define service{
        name                    generic-service
        ...
        normal_check_interval   15
        ...
}
==============
nagiostats output
==============
Nagios Stats 3.2.3
Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
Last Modified: 10-03-2010
License: GPL
CURRENT STATUS DATA
------------------------------------------------------
Status File: /usr/local/nagios/var/status.dat
Status File Age: 0d 0h 0m 17s
Status File Version: 3.2.3
Program Running Time: 0d 17h 43m 2s
Nagios PID: 18854
Used/High/Total Command Buffers: 0 / 0 / 4096
Total Services: 68206
Services Checked: 68206
Services Scheduled: 68206
Services Actively Checked: 68206
Services Passively Checked: 0
Total Service State Change: 0.000 / 43.880 / 2.774 %
Active Service Latency: 40.671 / 503.137 / 234.919 sec
Active Service Execution Time: 0.003 / 24.737 / 2.527 sec
Active Service State Change: 0.000 / 43.880 / 2.774 %
Active Services Last 1/5/15/60 min: 0 / 2897 / 35932 / 68206
Passive Service Latency: 0.000 / 0.000 / 0.000 sec
Passive Service State Change: 0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit: 46943 / 56 / 7660 / 13547
Services Flapping: 980
Services In Downtime: 0
Total Hosts: 34103
Hosts Checked: 34103
Hosts Scheduled: 34103
Hosts Actively Checked: 34103
Host Passively Checked: 0
Total Host State Change: 0.000 / 63.820 / 2.598 %
Active Host Latency: 0.000 / 474.337 / 247.944 sec
Active Host Execution Time: 0.000 / 20.354 / 2.033 sec
Active Host State Change: 0.000 / 63.820 / 2.598 %
Active Hosts Last 1/5/15/60 min: 0 / 5936 / 29437 / 34103
Passive Host Latency: 0.000 / 0.000 / 0.000 sec
Passive Host State Change: 0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0
Hosts Up/Down/Unreach: 23591 / 10512 / 0
Hosts Flapping: 597
Hosts In Downtime: 0
Active Host Checks Last 1/5/15 min: 3 / 89 / 209
Scheduled: 0 / 0 / 0
On-demand: 3 / 89 / 209
Parallel: 0 / 0 / 0
Serial: 0 / 0 / 0
Cached: 3 / 89 / 209
Passive Host Checks Last 1/5/15 min: 0 / 0 / 0
Active Service Checks Last 1/5/15 min: 0 / 0 / 0
Scheduled: 0 / 0 / 0
On-demand: 0 / 0 / 0
Cached: 0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0
External Commands Last 1/5/15 min: 0 / 0 / 0
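
One way to read the "Last 1/5/15/60 min" counters above is as coverage: with a 15-minute interval, an evenly running scheduler would have executed close to all checks within the last 15 minutes. The sketch below parses a few lines copied from the output and computes that fraction. The helper names are hypothetical and the parsing is deliberately minimal, not a general nagiostats parser.

```python
import re

# Lines copied verbatim from the nagiostats output above.
SAMPLE = """\
Total Services: 68206
Active Services Last 1/5/15/60 min: 0 / 2897 / 35932 / 68206
Total Hosts: 34103
Active Hosts Last 1/5/15/60 min: 0 / 5936 / 29437 / 34103
"""


def parse_stats(text):
    """Map each 'Label: a / b / c' line to a list of its numeric values."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines without a 'label: value' shape
        numbers = re.findall(r"\d+(?:\.\d+)?", value)
        stats[key.strip()] = [float(n) for n in numbers]
    return stats


def coverage(stats, kind, window_index=2):
    """Fraction of all checks of `kind` run in the chosen window.

    window_index picks from the 1/5/15/60 min columns (2 = last 15 min).
    """
    total = stats[f"Total {kind}"][0]
    checked = stats[f"Active {kind} Last 1/5/15/60 min"][window_index]
    return checked / total


stats = parse_stats(SAMPLE)
for kind in ("Services", "Hosts"):
    print(kind, f"{coverage(stats, kind):.1%} checked in the last 15 min")
```

On these numbers only about half of the services were checked within their nominal 15-minute interval. That is consistent with the average latency of roughly 235 seconds reported above, although it does not by itself explain the cause.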
On Tue, Aug 23, 2011 at 6:14 PM, Sven Nierlein <Sven.Nierlein at consol.de> wrote:
> On 8/23/11 22:21, Rodney Ramos wrote:
> > When I changed max_concurrent_checks from "0" to "200", the nagios
> > process fell down to 30/50%. However, the latency increased a lot, going to
> > more than 1000 sec!!
>
> Which means you usually have more than 200 concurrent checks. Maybe
> 400-500. When I compare that to your initial mail, which mentions 60k services
> + 30k hosts in a 15 min interval, I get only 100 checks / second. Are you sure
> about the 15 min interval? How many checks do you have per second? Did you
> change your interval_length?
>
> Sven
>
> _______________________________________________
> Nagios-devel mailing list
> Nagios-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-devel
>
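
The concurrency question in the quoted reply can be approximated with Little's law: average concurrency = check rate × average execution time. The following is a rough sketch under stated assumptions, not a description of how Nagios schedules anything. It assumes a 900-second interval (check_interval 15 with the default interval_length of 60) and uses the average execution times from the nagiostats output above.

```python
def avg_concurrency(num_checks: int, period_seconds: float,
                    avg_exec_seconds: float) -> float:
    """Little's law: mean checks in flight = arrival rate * mean duration."""
    rate_per_second = num_checks / period_seconds
    return rate_per_second * avg_exec_seconds


PERIOD = 15 * 60  # assumed: check_interval 15, interval_length 60

# Counts and average execution times taken from the nagiostats output.
services = avg_concurrency(68206, PERIOD, 2.527)
hosts = avg_concurrency(34103, PERIOD, 2.033)

print(f"services: {services:.0f}, hosts: {hosts:.0f}, "
      f"total: {services + hosts:.0f} checks in flight on average")
```

Even this steady-state average comes out above 200, so a max_concurrent_checks limit of 200 would be expected to throttle the scheduler and push latency up. Bursts in the schedule would raise the peak further, which fits the 400-500 estimate in the reply.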