Host and Services update function called twice
Matthieu Kermagoret
mkermagoret at merethis.com
Thu May 14 17:52:21 CEST 2009
Hello,
I'm Matthieu and I work with Julien, here at Merethis. I see that
there's a bit of a misunderstanding so I'll try to clarify and explain
what we believe to be a bug in Nagios. All code, dumps and
explanations below are extracted from the latest CVS revision of
Nagios.
On Thu, May 14, 2009 at 1:44 PM, Andreas Ericsson <ae at op5.se> wrote:
> The figures you posted are really just crap to me as I have no idea what
> the different figures are suppose to mean.
>
Those are just a plain-text dump of what ndomod sends to ndo2db. The
format is really simple. Just notice that each "paragraph" is a
separate event that generates a DB query (i.e. if the same paragraph
appears twice in a row, the same query is executed twice on the DB).
> A hook such as the one below would let you debug this
> properly:
>
> [...]
> if (ds->type != NEBTYPE_SERVICE_CHECK_PROCESSED) {
> return 0;
> }
> [...]
>
That's what tipped me off. In fact, we weren't talking about
SERVICE_CHECK events but about SERVICE_STATUS events! So I guess your
explanation about the DNX support code is off the table... Right?
Now that we're clear, here are my first investigations.
It seems that for each service status update on Nagios, the
update_service_status() function from common/statusdata.c is called
twice. This function generates a NEBTYPE_SERVICESTATUS_UPDATE event
each time it's called. Below is what I believe to be the offending
code from base/checks.c:
<code>
int handle_async_service_check_result(service *temp_service,
                                      check_result *queued_check_result){
[...]
	/* schedule a non-forced check if we can */
	if(temp_service->should_be_scheduled==TRUE)
		schedule_service_check(temp_service,temp_service->next_check,CHECK_OPTION_NONE);
[...] /* No modification of temp_service in between. */
	update_service_status(temp_service,FALSE);
</code>
Here is what to notice:
- the call to schedule_service_check() with temp_service
- the call to update_service_status() below it, with no modification of
temp_service in between
<code>
void schedule_service_check(service *svc, time_t check_time, int options){
[...]
	/* update the status log */
	update_service_status(svc,FALSE);
</code>
Unfortunately, when scheduling the next service check, it is possible
that the same temp_service object is reused, with only its next check
time updated. The event could therefore be broadcast a first time in
schedule_service_check() and a second time in
handle_async_service_check_result().
So what do you think? I'm new to the Nagios code, so I might be mistaken.
Best regards,
--
Matthieu KERMAGORET | Developer
mkermagoret at merethis.com
MERETHIS is the publisher of the Centreon software.