Race condition in freshness checking
Ethan Galstad
nagios at nagios.org
Thu Sep 27 17:16:29 CEST 2007
Ton Voon wrote:
> Hi!
>
> We found a bug in the calculation of the latency for a passive check.
> This has highlighted a possible race condition re: freshness checking.
> We wanted to get some ideas on what is the best approach to fix this.
>
> Background:
>
> We have a master/slave arrangement, with freshness checking
> (freshness_threshold=0) of slave services on the master.
>
> Looking in the NDO db, we realised that the latency values for passive
> results were incorrectly calculated; sometimes latency values could be
> 1000x out. The patch is attached. However, since using this patch, we've
> seen occasional race conditions.
>
> Problem:
>
> Within checks.c:check_service_result_freshness, if a service has passed
> its expiration_time, it is marked as is_being_freshened and a forced
> service check is scheduled. However, if a passive result for this
> service is processed before this forced check is run, then the service
> is marked as stale and the state is inconsistent between master and slave.
>
> Possible solutions:
>
> - If a check result is processed with is_being_freshened set for the
> service, then remove forced check from schedule if it exists.
> - Change is_being_freshened to stale_time (0 if not stale). On running
> the forced check, if stale_time is less than last_check_time (+
> latency?), break out of running the forced check.
>
> Neither of these sounds particularly appealing to us. Are there other
> possible solutions? Any opinions?
>
> Ton
>
I think this race condition was brought up once before on the list, so
I'll take a look at what can be done. I think a reasonable solution can
be found to work for Nagios 3, but backporting it to Nagios 2 will be
more challenging due to the different check result IPC.
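
For reference, the second option Ton describes (replacing is_being_freshened
with a stale_time) could be sketched roughly as below. This is only an
illustration under stated assumptions, not the actual Nagios code: the
service_sketch struct and the three function names are hypothetical, and
the "(+ latency?)" adjustment raised in the quoted message is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, trimmed-down stand-in for the Nagios service struct. */
typedef struct {
    time_t last_check;  /* when the latest result (active or passive) was processed */
    time_t stale_time;  /* when the service was found stale; 0 if not stale */
} service_sketch;

/* Freshness checker: the result has expired, so remember when we noticed. */
void mark_stale(service_sketch *svc, time_t now)
{
    assert(svc != NULL);
    if (svc->stale_time == 0)
        svc->stale_time = now;
}

/* Result processing: any incoming result updates last_check. */
void record_result(service_sketch *svc, time_t when)
{
    assert(svc != NULL);
    svc->last_check = when;
}

/*
 * Called just before the forced check would run. Returns false when a
 * result arrived after the service was marked stale (stale_time is less
 * than last_check), in which case the forced check should be skipped.
 * The stale marker is cleared either way.
 */
bool forced_check_still_needed(service_sketch *svc)
{
    assert(svc != NULL);
    bool needed = svc->stale_time != 0 && !(svc->stale_time < svc->last_check);
    svc->stale_time = 0;
    return needed;
}
```

With this shape, a passive result that lands between the freshness check and
the forced check makes the forced check a no-op, so the scheduled event does
not have to be found and removed from the queue.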
Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org