[PATCH] checks: Set check state to UNKNOWN when there's a timeout
Andreas Ericsson
ae at op5.se
Mon Aug 19 16:16:30 CEST 2013
On 2013-08-19 14:59, Robin Sonefors wrote:
> When a check - host or service - fails to exit properly, it always
> becomes CRITICAL. When an active host check times out, it goes into
> UNKNOWN. When a service check times out, nothing is done to the state,
> which led to my system telling me that a check that timed out was an OK
> check.
>
> This thus sets the state to UNKNOWN when the check didn't exit in time,
> because that seems to make more sense and is analogous with what's done
> for host checks.
>
> Still, the whole CRITICAL vs UNKNOWN described above makes me a bit less
> confident in my fix than I'd like - does this need more work?
>
Well, it needs to honor "service_check_timeout_state", which is a global
variable. Apart from that, it might be nice to save the output, if
there is any, or to create the output in base/workers.c, where we
handle all non-ok helper exit codes anyway. I think jobs of type
WPJOB_CHECK already have their own case label, so the effort
shouldn't be huge.
Saving output will only be useful in very (very) rare cases though, so
concatenating it with the non-null plugin_output should work ok.
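
A rough sketch of the two points above, not the actual patch. Only
"service_check_timeout_state" and "plugin_output" are names taken from
this thread; the helper functions, their parameters, the message text
and the initial value of the global are assumptions made up for
illustration, and the real code in base/workers.c may look different.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Conventional plugin return codes. */
#define STATE_OK       0
#define STATE_WARNING  1
#define STATE_CRITICAL 2
#define STATE_UNKNOWN  3

/* Stand-in for the global mentioned above. The initial value here is
 * arbitrary; in the daemon it would come from the configuration. */
static int service_check_timeout_state = STATE_UNKNOWN;

/* Hypothetical helper: pick the state for a finished service check.
 * A timed-out check gets the configured timeout state instead of
 * whatever return code happened to be recorded. */
static int timed_out_service_state(int early_timeout, int return_code)
{
    return early_timeout ? service_check_timeout_state : return_code;
}

/* Hypothetical helper: build the output for a timed-out check.
 * Any partial plugin output is appended to the generated message
 * rather than thrown away. */
static void timed_out_service_output(char *buf, size_t len,
                                     int timeout_secs,
                                     const char *plugin_output)
{
    assert(buf != NULL && len > 0);

    if (plugin_output != NULL && *plugin_output != '\0')
        snprintf(buf, len, "(Service check timed out after %d seconds) %s",
                 timeout_secs, plugin_output);
    else
        snprintf(buf, len, "(Service check timed out after %d seconds)",
                 timeout_secs);
}
```

With this shape the state decision stays a single lookup of the global,
and the output handling is the same whether or not the plugin managed
to print anything before it was killed.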
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.