Max concurrent checks - spreading the next_time
Ton Voon
ton.voon at opsera.com
Thu Jun 11 09:44:47 CEST 2009
On 10 Jun 2009, at 09:52, Andreas Ericsson wrote:
> Ton Voon wrote:
>> I propose that instead of setting next_time = next_time +
>> check_interval, that there is a random factor added, maybe
>> something like:
>>
>> next_time = now + max(5, min(int(rand(15)),
>> int(rand(retry_interval*interval_length))))
>>
>> This means that the next check has been moved at least 5 seconds away
>> from now (to overcome the temporary load due to the number of concurrent
>> service checks), with a maximum of 15 seconds away (or less if the
>> retry_interval is lower).
>>
> I can't help but think that something like this could have been quite
> easily resolved with a round-robin scheduling queue, where items requested
> to be queued would simply get inserted within 5 seconds of the requested
> time where there are the most free slots. The prng idea will probably
> work just as well though, and I'm fairly certain you could just use
>
> next_time = service->check_interval - 7 + (*service->description & 0xf);
>
> to get a distribution almost equally good without having to bother
> about the PRNG-business. This would yield 7 seconds +-, which is
> probably good enough.
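
For comparison, here is a minimal C sketch of the quoted one-liner. The
function name hashed_next_time and its parameters are made up for
illustration and are not taken from the Nagios source, and adding the result
to "now" is an assumption, since the quoted expression leaves that part out.
The low four bits of the first character of the description give 0..15, so
the offset works out to -7..+8 seconds around the check interval.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper (not a Nagios function): derive a fixed offset from
 * the first character of the service description instead of using a PRNG.
 * check_interval_secs is assumed to be already converted to seconds.
 */
time_t hashed_next_time(time_t now, int check_interval_secs,
                        const char *description)
{
    /* Low nibble of the first character: 0..15. A NULL description is
     * treated as nibble 7, i.e. no offset (an assumption of this sketch). */
    int nibble = (description != NULL) ? (*description & 0xf) : 7;

    /* Shift to the range -7..+8 around the regular interval. */
    return now + check_interval_secs + (nibble - 7);
}
```

The offset is deterministic per service, so services whose descriptions
start with the same character would still be scheduled together.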
I notice that rand() is already used elsewhere in Nagios, so I will go
with that instead.
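
Roughly, the rand() variant could look like the sketch below. This is only
an illustration of the formula quoted above, not the actual patch: the
helper names (rand_below, jittered_next_time) are hypothetical, the caller
is assumed to have seeded the generator with srand(), and rand() % n has a
slight modulo bias that should not matter for spreading checks out.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: pseudo-random integer in [0, n), or 0 when n <= 0. */
static int rand_below(int n)
{
    return (n > 0) ? rand() % n : 0;
}

/*
 * Sketch of:
 *   next_time = now + max(5, min(int(rand(15)),
 *                                int(rand(retry_interval*interval_length))))
 * retry_interval and interval_length mirror the Nagios settings of the same
 * names; the function itself is an assumption of this example.
 */
time_t jittered_next_time(time_t now, double retry_interval,
                          int interval_length)
{
    int retry_secs = (int)(retry_interval * interval_length);
    int a = rand_below(15);          /* 0..14 */
    int b = rand_below(retry_secs);  /* 0..retry_secs-1, or 0 */
    int offset = (a < b) ? a : b;    /* min(a, b) */

    if (offset < 5)                  /* max(5, offset) */
        offset = 5;

    return now + offset;
}
```

With a retry_interval of 1 and an interval_length of 60, this moves the
check between 5 and 14 seconds away from now.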
Ton