Eternally pending, stale checks

Mike Lindsey mike-nagios at 5dninja.net
Fri Aug 5 00:53:19 CEST 2011


I deployed new monitoring today, and despite a few restarts and many 
hours of waiting, 185/220 services are still pending.

It's a Nagios 3.2.1 environment (yes, yes, upgrade, yes) with one master 
and multiple pollers.  All of the new monitoring is on one polling host.  
Active checks are disabled on the master; passive check results are 
submitted via NSCA.  The freshness threshold is set to 20 minutes for 
checks with a 5-minute interval.
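
To make the timing concrete, here is a rough sketch of the arithmetic 
behind those two settings.  It is a simplification and not Nagios's 
actual freshness logic, which also takes other factors into account; 
the function names are made up for illustration.

```python
# Simplified model of the settings described above. This is an
# illustration only, not how Nagios itself computes freshness.
FRESHNESS_THRESHOLD = 20 * 60  # seconds (20 minutes, from the post)
CHECK_INTERVAL = 5 * 60        # seconds (5 minutes, from the post)


def is_stale(last_result_time, now, threshold=FRESHNESS_THRESHOLD):
    """Return True if the last passive result is older than the threshold."""
    return (now - last_result_time) > threshold


def intervals_per_threshold(threshold=FRESHNESS_THRESHOLD,
                            interval=CHECK_INTERVAL):
    """How many check intervals fit inside one freshness window."""
    return threshold // interval
```

Under this model, four consecutive results have to go missing before a 
service is flagged stale, so a single dropped NSCA submission would not 
be enough to explain the messages.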

The polling host executes the checks and has the right data in its 
status.log, but the master never receives some of the check results.

The data it does receive is not consistently grouped.  Service A on one 
host will submit consistently, but the same service on a different host 
will fail to submit.  Every 20 minutes, the master logs messages about 
the checks being stale and needing a forced immediate check, but that 
forced check never seems to make its way through.

My next step, I suppose, will be enabling debug mode on the master, but 
if history is any indication, that will cause the problem to stop 
happening - in addition to it being a pain to parse through debug logs 
for a 10k-service environment.  If anyone has ideas on what else to 
check, I'm all ears.
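
One lower-cost way to see which services are stuck, short of debug 
logging, is to read the master's status file directly.  The sketch 
below is only that: it assumes the Nagios 3 status file layout 
(servicestatus { ... } blocks of key=value lines with a 
has_been_checked field), and the function names are hypothetical.

```python
from collections import Counter


def parse_service_blocks(text):
    """Yield one dict per 'servicestatus { ... }' block in a status file.

    Assumes the Nagios 3 layout: a 'servicestatus {' line, then
    key=value lines, then a line containing only '}'.
    """
    block = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("servicestatus") and line.endswith("{"):
            block = {}
        elif line == "}":
            if block is not None:
                yield block
            block = None
        elif block is not None and "=" in line:
            key, _, value = line.partition("=")
            block[key] = value


def pending_by_host(text):
    """Count services per host that have never received a result."""
    counts = Counter()
    for svc in parse_service_blocks(text):
        if svc.get("has_been_checked") == "0":
            counts[svc.get("host_name", "?")] += 1
    return counts


def report(path):
    """Print pending-service counts per host, highest first."""
    with open(path) as fh:
        for host, n in pending_by_host(fh.read()).most_common():
            print(f"{n:5d}  {host}")
```

Calling report() with the path to the master's status file (the 
location depends on the status_file setting in nagios.cfg) lists the 
hosts with the most pending services first.  Running the same thing 
against the poller's file would show whether the gap is on the sending 
or the receiving side.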

-- 
Mike Lindsey


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users