3.0b5: External commands are not turned into passive checks after a while
Steffen Poulsen
step at tdc.dk
Sat Oct 13 22:10:18 CEST 2007
Hi,
We are experiencing a problem with Nagios external commands not being
processed correctly after ~30 hours of uptime on our master server. This
server _only_ receives check results through NSCA; it does no active checking.
The server receives 5276 external commands per 5 minutes, according to its
performance data. For the first 30 hours of uptime, the Passive Service
Checks stats reflect this as well.
But at some point the command processing stops: all the Nagios server
sees is the external command, and there is no log line for the passive
check being handled.
A few notes on this condition: Using the "top" utility we notice memory
usage bumps up and down between using practically none and all available
memory (one cycle every second or so):
3169 nagios 1 0 0 631M 559M cpu/15 0:06 4.19% nagios
(The normal scenario, during the first 30 hours, is nagios using ~25 MB of memory):
4481 nagios 2 0 0 23M 20M cpu/0 1:34 1.60% nagios
When this condition appears, it is not enough to stop and start Nagios;
we also have to clean out the checkresults directory.
It contains files like this:
-rw------- 1 nagios nagios 439458 Oct 13 21:34 cywyyBl
(Some are quite large, up to 1 MB.)
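For anyone who wants to check whether their own checkresults directory is backing up in a similar way, here is a minimal Python sketch that counts the files in a directory and lists them largest first. The function name `summarize_spool` and the 256 KB threshold are made up for illustration, and the directory path has to be supplied by the caller, since it depends on the `check_result_path` setting of the installation.

```python
import os
from typing import List, Tuple


def summarize_spool(directory: str) -> Tuple[int, int, List[Tuple[str, int]]]:
    """Return (file count, total bytes, [(name, size), ...] largest first).

    Only regular files directly inside `directory` are counted;
    subdirectories and symlinks are skipped.
    """
    entries: List[Tuple[str, int]] = []
    with os.scandir(directory) as it:
        for entry in it:
            if entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                entries.append((entry.name, size))
    # Largest files first, so oversized result files show up at the top.
    entries.sort(key=lambda item: item[1], reverse=True)
    total = sum(size for _, size in entries)
    return len(entries), total, entries


def oversized(entries: List[Tuple[str, int]],
              threshold: int = 256 * 1024) -> List[Tuple[str, int]]:
    """Filter to files at or above `threshold` bytes (illustrative default)."""
    return [(name, size) for name, size in entries if size >= threshold]
```

Calling `summarize_spool()` on the checkresults directory at intervals and watching whether the count and total size keep growing would show whether result files are piling up unprocessed.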
Other side effects of this condition:
* Nagios doesn't notice freshness checks that go stale (it
recognizes stale checks after it starts again)
* Nagios doesn't update status.dat (cgis show stale information)
* As checks are not recognized, no performance data or other check
related stuff is processed
This is a Sun T1000 w. Solaris 10, Nagios 3.0b5 compiled with gcc.
Any ideas appreciated.
Best regards,
Steffen Poulsen