high Service Check Latency

Rick Garland Rick.Garland at quantum.com
Fri Oct 28 18:46:09 CEST 2011


Hi all:

 

Dell PE2950, 16 GB RAM, plenty of disk space, etc.
Just upgraded to Nagios 3.3.1 from Nagios 3.2.3
MySQL 5.0.77
NDO2DB 1.4b9
RRDTool 1.4.5
NRPE 2.8.1

I have been using Nagios for a while (since Nagios 2.x) and have kept
upgrading; the latest upgrade was from 3.2.3. Every previous upgrade went
off without problems, except this one to Nagios 3.3.1.

The Service Check Latency has jumped from about 1 second to 200+
seconds. I have searched for tuning tips and made the following
changes in nagios.cfg, but with little effect:

max_concurrent_checks=100
check_result_reaper_frequency=15
max_check_result_reaper_time=25

The output below is from nagiostats:

Nagios Stats 3.3.1
Copyright (c) 2003-2008 Ethan Galstad (http://www.nagios.org)
Last Modified: 07-25-2011
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File: /usr/local/nagios/var/status.dat
Status File Age: 0d 0h 0m 11s
Status File Version: 3.3.1

Program Running Time: 0d 16h 27m 5s
Nagios PID: 6224
Used/High/Total Command Buffers: 0 / 3 / 4096

Total Services: 2023
Services Checked: 2023
Services Scheduled: 2020
Services Actively Checked: 2023
Services Passively Checked: 0
Total Service State Change: 0.000 / 9.870 / 0.020 %
Active Service Latency: 0.008 / 324.337 / 242.021 sec
Active Service Execution Time: 0.011 / 52.108 / 0.874 sec
Active Service State Change: 0.000 / 9.870 / 0.020 %
Active Services Last 1/5/15/60 min: 128 / 902 / 1850 / 1935
Passive Service Latency: 0.000 / 0.000 / 0.000 sec
Passive Service State Change: 0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit: 2021 / 1 / 0 / 1
Services Flapping: 0
Services In Downtime: 0

Total Hosts: 152
Hosts Checked: 152
Hosts Scheduled: 28
Hosts Actively Checked: 152
Host Passively Checked: 0
Total Host State Change: 0.000 / 0.000 / 0.000 %
Active Host Latency: 0.000 / 471.193 / 284.723 sec
Active Host Execution Time: 0.007 / 0.162 / 0.044 sec
Active Host State Change: 0.000 / 0.000 / 0.000 %
Active Hosts Last 1/5/15/60 min: 4 / 16 / 29 / 29
Passive Host Latency: 0.000 / 0.000 / 0.000 sec
Passive Host State Change: 0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0
Hosts Up/Down/Unreach: 152 / 0 / 0
Hosts Flapping: 0
Hosts In Downtime: 0

Active Host Checks Last 1/5/15 min: 5 / 17 / 49
  Scheduled: 5 / 16 / 44
  On-demand: 0 / 1 / 5
  Parallel: 5 / 16 / 46
  Serial: 0 / 0 / 0
  Cached: 0 / 1 / 3
Passive Host Checks Last 1/5/15 min: 0 / 0 / 0
Active Service Checks Last 1/5/15 min: 179 / 939 / 2898
  Scheduled: 179 / 939 / 2898
  On-demand: 0 / 0 / 0
  Cached: 0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0

External Commands Last 1/5/15 min: 0 / 0 / 0
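
For anyone who wants to track these numbers over time, the latency and execution-time lines in the output above can be pulled out with a short script. This is only a minimal sketch in Python: it assumes the three values on each "sec" line are minimum / maximum / average (the figures above are consistent with that reading), and the function name parse_latency_lines is made up for this example.

```python
import re

# Matches lines such as "Active Service Latency: 0.008 / 324.337 / 242.021 sec".
# Lines ending in "%" or carrying four values are deliberately not matched.
TRIPLE = re.compile(
    r"^\s*(?P<name>[^:]+):\s*"
    r"(?P<min>[\d.]+)\s*/\s*(?P<max>[\d.]+)\s*/\s*(?P<avg>[\d.]+)\s*sec\s*$"
)


def parse_latency_lines(text):
    """Return {metric name: (min, max, avg)} for every 'a / b / c sec' line.

    Assumption: the three values are minimum, maximum and average, in that order.
    """
    stats = {}
    for line in text.splitlines():
        m = TRIPLE.match(line)
        if m:
            stats[m.group("name").strip()] = (
                float(m.group("min")),
                float(m.group("max")),
                float(m.group("avg")),
            )
    return stats


# Sample lines copied from the output in this post.
sample = """\
Active Service Latency: 0.008 / 324.337 / 242.021 sec
Active Service Execution Time: 0.011 / 52.108 / 0.874 sec
Active Host Latency: 0.000 / 471.193 / 284.723 sec
Active Host Execution Time: 0.007 / 0.162 / 0.044 sec
"""

if __name__ == "__main__":
    for name, (lowest, highest, average) in parse_latency_lines(sample).items():
        print(f"{name}: avg {average:.3f} s (max {highest:.3f} s)")
```

Run on the sample lines, this makes the gap easy to see: average execution time stays under a second while average latency is in the hundreds of seconds.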

 

One thing I did notice, though I don't know yet whether it is related:
before the upgrade, CPU sys time stayed around 3%. Since the upgrade it
has been running around 20%.

I also see the run queue jump by a factor of 5 at times.
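
To put a number on the sys CPU share independently of any particular tool, two snapshots of /proc/stat can be compared. The following is a rough sketch in Python, assuming a Linux host. The helper names cpu_fields, sys_percent and sample_proc_stat are made up for this example, and it counts only the "system" field, whereas some tools also fold irq and softirq time into their sys figure, so the result may read slightly lower than theirs.

```python
import time


def cpu_fields(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return [int(x) for x in parts[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def sys_percent(before, after):
    """Percentage of CPU time spent in 'system' between two /proc/stat snapshots.

    Field order on Linux: user nice system idle iowait irq softirq steal ...
    Only the first eight fields are summed, because guest time is already
    accounted for inside user time.
    """
    deltas = [b - a for a, b in zip(cpu_fields(before), cpu_fields(after))]
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[2] / total


def sample_proc_stat(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return the sys %."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval)
    with open(path) as f:
        after = f.read()
    return sys_percent(before, after)
```

Calling sample_proc_stat(5) at intervals while Nagios is running and again while it is stopped would show how much of the sys time goes away with the daemon.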

 

 

I have been unable to find a cause or a solution. Has anybody else seen this?

 

Thanks

 

Rick Garland | Sr UNIX Systems Administrator | Quantum, Corp | Office:
720-249-5984 | cell: 720-210-4671

 
