RFC: New IPC Method for Check Results
Ethan Galstad
nagios at nagios.org
Thu Apr 12 06:20:01 CEST 2007
Hendrik Bäcker wrote:
> Ethan Galstad wrote:
>> Proposed solution:
>>
>> The new method I am proposing is simple and straightforward. Why I
>> didn't implement something like this years ago is beyond me. :-)
>>
> Because you just wanted to start your programming career with a pipe? *just
> kidding*
>> Instead of passing check results from child processes to the main Nagios
>> process via two methods (pipe and file), I suggest that all information
>> be written to files in a special check result queue directory (e.g.,
>> var/checkresults). Child processes that perform host/service checks can
>> write all results to a file in the queue directory. The main Nagios
>> process will then periodically process all files/check results in the
>> queue in a time-ordered fashion.
>>
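The queue-directory scheme quoted above can be sketched roughly as follows.
This is a minimal illustration in Python, not Nagios code: the function names
write_result and drain_queue, the ".result" suffix, and the timestamp-based
file naming are all assumptions made for the example. Writing to a temporary
file and renaming it into place keeps the reader from ever seeing a
half-written result.

```python
import itertools
import os
import tempfile
import time

_sequence = itertools.count()  # tie-breaker for results written in the same instant


def write_result(queue_dir, text):
    """Write one check result to its own file in queue_dir (hypothetical layout)."""
    os.makedirs(queue_dir, exist_ok=True)
    # Write to a temporary dot-file first so the reader never sees a partial file.
    fd, tmp_path = tempfile.mkstemp(prefix=".tmp-", dir=queue_dir)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    # Zero-padded timestamp in the name: sorting names gives time order.
    name = "%020d-%d-%06d.result" % (time.time_ns(), os.getpid(), next(_sequence))
    final_path = os.path.join(queue_dir, name)
    os.rename(tmp_path, final_path)  # atomic on POSIX within one filesystem
    return final_path


def drain_queue(queue_dir):
    """Return the contents of all queued results, oldest first, deleting each file."""
    results = []
    for name in sorted(os.listdir(queue_dir)):
        if not name.endswith(".result"):
            continue  # skip temporary files that are still being written
        path = os.path.join(queue_dir, name)
        with open(path) as handle:
            results.append(handle.read())
        os.remove(path)
    return results
```

A child process would call write_result("var/checkresults", ...) once per
check, and the main process would call drain_queue("var/checkresults") on
each pass through its event loop.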
> Some of you may remember my post about "a good way to handle performance
> data" and the short discussion of pipes vs. "spool dirs".
> In the current release of the PNP addon we have introduced a small
> daemon that does exactly what you describe above.
> Brief aside: Nagios writes only perfdata files and rotates them into a
> spool dir every x seconds; the daemon reads the files and processes them
> to fill the RRD files.
> This solution brought my latency down from around 350 seconds (~2000
> service checks) to 2-5 seconds.
Good to hear that you saw such improvements. Hopefully this will have
similar effects for passive checks...
>
> Because of this I would say: this is the right way.
>> Any performance hits that may occur with the new IPC method due to disk
>> thrashing can be minimized if the queue directory is placed on a
>> memory-backed filesystem (e.g., a RAM disk or tmpfs). Whether this will
>> actually be necessary in anything but the largest installations remains
>> to be seen.
>>
> I would suggest keeping an eye on the number of files in a
> directory. I know some people with a huge number of distributed Nagios
> servers and a large number of service checks.
> It could be bad if Nagios is down for hours and, on restart, has to
> process thousands of single files, assuming one file is used per result.
I'll make sure that multiple results can be stored in a single file
(ideal for bulk transfers using NSCA). A configurable option will allow
Nagios to process only results made within a certain timeframe. I think
that should take care of it.
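
To make the two ideas above concrete, here is a rough Python sketch of a
file holding several results plus an age cutoff. The on-disk format
(key=value lines, records separated by a blank line), the finish_time key,
and the names parse_results and fresh_results are assumptions for
illustration only, not a committed Nagios format or option name.

```python
def parse_results(text):
    """Split a file body into records; blank lines separate records (assumed format)."""
    records = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the record collected so far.
            if current:
                records.append(current)
                current = {}
            continue
        key, _, value = line.partition("=")
        current[key] = value
    if current:
        records.append(current)  # last record may lack a trailing blank line
    return records


def fresh_results(records, now, max_age):
    """Keep records whose finish_time is at most max_age seconds before now."""
    return [r for r in records if now - float(r["finish_time"]) <= max_age]
```

With max_age set to 3600, a result that finished 4000 seconds ago would be
dropped while one that finished 1000 seconds ago would still be processed,
so a long outage does not flood the scheduler with stale results.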
>
> Just my 2 Cents.
>
> Kind regards
> Hendrik
>
Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org