Problems with freshness checking

brian.boysen at colinx.com
Thu Jun 10 23:11:28 CEST 2004


Hi, I've just joined this mailing list to investigate a problem I've seen.
I looked through the archives, and someone named Fabio Lo Votrico posted a
question here about passive service checks being reported as stale and Nagios
"forcing an immediate check", even though the log showed a
PROCESS_SERVICE_CHECK_RESULT within the allotted time.
Was this answered off the mailing list (or did the answer just not make it
into the archives)? If so, where could I find it?

When I've seen it, an external service logged 144
PROCESS_SERVICE_CHECK_RESULT entries with the same timestamp (I'm
guessing this means that all the results came in during the same processing
pass over the external commands file). About 2 minutes later, Nagios logged a
message indicating that the service(s) had timed out and that it was forcing
an active check. Because the active check is check_dummy!2, which is set up
to fail, the service is doomed at that point.
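
To put a number on the gap described above, here is a minimal sketch that scans a Nagios log and reports, for each stale warning, how many seconds had passed since the last logged passive result for that service. The two regular expressions are assumptions about the log line shapes (an "EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;host;service;..." entry and a "results of service ... are stale" warning), and the host and service names in the sample are made up; adjust the patterns to whatever your own log actually contains.

```python
import re

# Assumed log line shapes; adjust these patterns to match your own nagios.log.
PASSIVE = re.compile(
    r"^\[(\d+)\] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;([^;]+);([^;]+);"
)
STALE = re.compile(
    r"^\[(\d+)\] Warning: The results of service '([^']+)' on host '([^']+)' are stale"
)


def stale_gaps(lines):
    """Return (host, service, seconds) for each stale warning.

    seconds is the time between the warning and the most recent logged
    passive result for that host/service, or None if none was seen.
    """
    last_passive = {}
    gaps = []
    for line in lines:
        m = PASSIVE.match(line)
        if m:
            last_passive[(m.group(2), m.group(3))] = int(m.group(1))
            continue
        m = STALE.match(line)
        if m:
            ts, service, host = int(m.group(1)), m.group(2), m.group(3)
            prev = last_passive.get((host, service))
            gaps.append((host, service, None if prev is None else ts - prev))
    return gaps


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real log.
    sample = [
        "[1086901888] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;hostA;backup;0;OK",
        "[1086902008] Warning: The results of service 'backup' on host 'hostA' "
        "are stale by 20 seconds (threshold=100 seconds).  "
        "I'm forcing an immediate check of the service.",
    ]
    for host, service, gap in stale_gaps(sample):
        print(host, service, gap)
```

If the reported gap is consistently shorter than the freshness threshold, that would suggest the passive result was logged in time but not yet applied when the freshness check ran.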

From what I can see, the processing of the service checks into the "same
queue for active checks" (sqfac) (from docs/passivechecks.html) is forked
off. The machine is a SUNW Ultra-250. Could it take 2 minutes for the results
to be processed into that queue and then for Nagios to recognize them there?
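
The question above can be illustrated with a toy model. This is only a sketch of the hypothesis, not Nagios's actual implementation: it assumes a service counts as stale when the time since its last recorded check exceeds the freshness threshold, and that a passive result only updates the last check time once it has been fully processed. All function names, parameters, and numbers here are hypothetical.

```python
def is_stale(now, last_check, freshness_threshold):
    """Assumed rule: stale when the last recorded check is older than the threshold."""
    return now - last_check > freshness_threshold


def seen_as_stale(submitted_at, processing_delay, previous_check,
                  freshness_threshold, freshness_check_at):
    """Would a freshness check at freshness_check_at see the service as stale?

    The passive result is submitted at submitted_at but, in this toy model,
    only becomes visible after processing_delay seconds. Until then the
    freshness check still sees previous_check as the last check time.
    """
    applied_at = submitted_at + processing_delay
    if applied_at <= freshness_check_at:
        last_check = submitted_at
    else:
        last_check = previous_check
    return is_stale(freshness_check_at, last_check, freshness_threshold)


if __name__ == "__main__":
    # Result submitted at t=1000, freshness check runs at t=1120.
    for delay in (0, 180):
        stale = seen_as_stale(
            submitted_at=1000,
            processing_delay=delay,
            previous_check=400,
            freshness_threshold=300,
            freshness_check_at=1120,
        )
        print(f"processing delay {delay}s: stale={stale}")
```

Under these assumptions, a result that is logged on time but takes longer to process than the interval before the next freshness check would still be treated as stale, which matches the symptom.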

Two nights ago I changed from the check_dummy!2 command to something that
actually looks up the status, but now one service always times out and
fails 20% of the time.

The command_check_interval is -1.