RFC/PATCH: Handle external service check results in separate thread

Thomas Guyot-Sionnest Thomas at zango.com
Fri Apr 13 20:36:47 CEST 2007


> -----Original Message-----
> From: nagios-devel-bounces at lists.sourceforge.net 
> [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf 
> Of Ethan Galstad
> Sent: April 13, 2007 7:03
> To: Nagios Developers List
> Subject: Re: [Nagios-devel] RFC/PATCH: Handle external service check
> results in separate thread
> 
> Stefan Rompf wrote:
> > Hi,
> > 
> > like other people on this list, we've been bitten by the problem that
> > nagios fork()s subprocesses when service check results arrive via the
> > external command pipe. When nagios lags for example due to hostchecks,
> > in most cases enough forked processes pile up to bring nagios over its
> > resource limits. Even if this doesn't happen, results will be fed in
> > the wrong order.
> > 
> > I've developed the following solution that is quite different to the
> > spool directory approach:
> > 
> > -passive service check results are added to passive_check_result_list
> > as before. However, for our use case it does not make sense to keep
> > multiple results for one service as soon as nagios starts lagging. So
> > we have a duplicate detection that keeps only the newest check result
> > per service.
> > -Instead of forking subprocesses, a permanently running thread feeds
> > the results on passive_check_result_list back via write_svc_message().
> > So two threads of the process talk to each other via a pipe, but I
> > didn't want to make my changes too invasive ;-)
> > -Instead of polling the command pipe every 0.5 seconds, select() on the
> > file descriptor is used now if there are enough
> > external_command_buffer_slots. Problem here was that with no writer on
> > the pipe, select() endlessly signaled an EOF. Fixed by opening the
> > command pipe R/W.
> > 
> > The patch has been developed on nagios 2.6 and linux, afterwards
> > forward ported to current CVS. It seems to work, but needs further
> > testing. Even compilation tests on different architectures would be
> > interesting, I'm not sure how widespread the tsearch()-API is.
> > 
> > Thoughts?
> > 
> > Stefan
> 
> Sounds interesting.  I'm still leaning towards the spool directory
> idea, as it provides some resistance to problems when Nagios isn't
> running and/or the external command file pipe fills up.

Whichever approach you choose, you can still switch to select() on the
external command pipe by opening it read/write. This is what I do in the
OCP_daemon.
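
To illustrate the idea, here is a minimal sketch in Python (not the actual
Nagios or OCP_daemon code). The function names and the buffer size are made
up for the example. It relies on Linux behaviour: POSIX leaves opening a
FIFO with O_RDWR undefined, so this may not carry over to other platforms.
Because the process itself holds a write end, select() blocks until real
data arrives instead of reporting EOF over and over when no external writer
is attached.

```python
import os
import select


def open_command_pipe(path):
    """Open (and create if missing) a FIFO in read/write mode.

    Hypothetical helper: holding the pipe open O_RDWR means there is
    always at least one writer, so select() does not see a permanent EOF.
    """
    if not os.path.exists(path):
        os.mkfifo(path, 0o660)
    # O_NONBLOCK keeps os.read() from blocking if the data is already gone.
    return os.open(path, os.O_RDWR | os.O_NONBLOCK)


def wait_for_commands(fd, timeout):
    """Wait up to `timeout` seconds for data and return the lines read.

    Returns an empty list on timeout. A real reader would also buffer a
    trailing partial line; this sketch does not.
    """
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return []
    data = os.read(fd, 65536)
    return data.decode().splitlines()
```

A caller would loop on wait_for_commands() with whatever timeout suits its
other housekeeping, rather than sleeping and polling the pipe.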

Just my 2 cents...

Thomas