Host and Services update function called twice
Andreas Ericsson
ae at op5.se
Thu May 14 20:09:15 CEST 2009
Matthieu Kermagoret wrote:
> Hello,
>
> I'm Matthieu and I work with Julien, here at Merethis. I see that
> there's a bit of a misunderstanding so I'll try to clarify and explain
> what we believe to be a bug in Nagios. All code, dumps and
> explanations below are extracted from the latest CVS revision of
> Nagios.
>
> On Thu, May 14, 2009 at 1:44 PM, Andreas Ericsson <ae at op5.se> wrote:
>> The figures you posted are really just crap to me as I have no idea what
>> the different figures are supposed to mean.
>>
>
> Those are just plain-text dumps of what ndomod sends to ndo2db. The
> format is really simple. Just notice that each "paragraph" is a
> different event that will generate a DB query (i.e. if you have the
> same paragraph twice in a row, you'll execute the same query twice on
> the DB).
>
>> A hook such as the one below would let you debug this
>> properly:
>>
>> [...]
>> if (ds->type != NEBTYPE_SERVICE_CHECK_PROCESSED) {
>> return 0;
>> }
>> [...]
>>
>
> That's what tipped me off. In fact we weren't talking about
> SERVICE_CHECK events but about SERVICE_STATUS events! So I guess your
> explanation about the DNX support code is off the table... Right?
>
> Now that we're clear, here are my first investigations.
>
> It seems that for each service status update on Nagios, the
> update_service_status() function from common/statusdata.c is called
> twice. This function generates a NEBTYPE_SERVICESTATUS_UPDATE event
> each time it's called. Below is what I believe to be the offending
> code from base/checks.c:
>
Ah, right. Now at least it makes sense :)
> <code>
>
> 881 int handle_async_service_check_result(service *temp_service,
> check_result *queued_check_result){
> [...]
> 1560 /* schedule a non-forced check if we can */
> 1561 if(temp_service->should_be_scheduled==TRUE)
> 1562 schedule_service_check(temp_service,temp_service->next_check,CHECK_OPTION_NONE);
> [...] /* No modification of temp_service in between. */
> 1590 update_service_status(temp_service,FALSE);
>
> </code>
>
> Here's what to notice:
> - the call to schedule_service_check() with temp_service
> - the call to update_service_status() below with no modification of
> temp_service
>
> <code>
>
> 1634 void schedule_service_check(service *svc, time_t check_time, int options){
> [...]
> 1764 /* update the status log */
> 1765 update_service_status(svc,FALSE);
>
> </code>
>
> Unfortunately, when scheduling the next service check, it is
> possible that the temp_service object is reused and merely updated
> with the next service check time. So the event could be broadcast a
> first time in schedule_service_check() and a second time in
> handle_async_service_check_result().
>
> So what do you think about it? I'm new to the Nagios code so I might be mistaken.
>
It seems you're right. I'll have to investigate this more in-depth. I'll file
it in mantis at tracker.nagios.org for now though.
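
For reference, the call pattern described above can be reduced to a small
stand-alone model. This is only a sketch, not Nagios code: the functions
borrow the names from the quoted excerpts, but the bodies, the trimmed-down
service struct, the status_events counter and the count_status_events()
helper are all assumptions made for illustration. It shows that when
should_be_scheduled is TRUE, one check result ends up producing two status
updates.

```c
#define TRUE 1
#define FALSE 0

/* Hypothetical, trimmed-down stand-in for the Nagios service struct. */
typedef struct {
    int should_be_scheduled;
    long next_check;
} service;

/* Counts how many status-update events the model has "broadcast". */
static int status_events = 0;

/* Stand-in for update_service_status(): each call emits one event. */
static void update_service_status(service *svc, int aggregated_dump) {
    (void)svc;
    (void)aggregated_dump;
    status_events++;
}

/* Stand-in for schedule_service_check(): it ends with a status update. */
static void schedule_service_check(service *svc, long check_time) {
    svc->next_check = check_time;
    update_service_status(svc, FALSE);
}

/* Stand-in for the tail of handle_async_service_check_result(). */
static void handle_async_service_check_result(service *svc) {
    /* schedule a non-forced check if we can */
    if (svc->should_be_scheduled == TRUE)
        schedule_service_check(svc, svc->next_check);
    /* svc is not modified in between, yet the status is updated again */
    update_service_status(svc, FALSE);
}

/* Feeds one check result through the model and returns the event count. */
int count_status_events(int should_be_scheduled) {
    service svc = { should_be_scheduled, 0 };
    status_events = 0;
    handle_async_service_check_result(&svc);
    return status_events;
}
```

In this model count_status_events(TRUE) returns 2 and
count_status_events(FALSE) returns 1, which matches the duplicated
"paragraphs" seen in the ndomod dump. Whether the real fix belongs in the
handler or in the scheduler is left open here.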
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Register now for Nordic Meet on Nagios, June 3-4 in Stockholm
http://nordicmeetonnagios.op5.org/
Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.