Nagios scheduling question
Ethan Galstad
nagios at nagios.org
Wed Mar 28 16:44:27 CEST 2007
Make sure the max_concurrent_checks is high enough. Running nagios with
the -s option should throw a warning if it isn't.
You can also compile the Nagios daemon with a debugging option that
prints information on scheduled tasks. Run the configure script like
this:
./configure --enable-DEBUG3
Then run Nagios as a foreground process and redirect the output to a
file that you can examine for potential problem messages.
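
As a rough cross-check of whether max_concurrent_checks is high enough, the
figures in the quoted message below (2500 services, about 1.5 seconds per
check, a 3-4 minute target interval) can be turned into an estimate of how
many checks must be in flight at once. This is only a back-of-the-envelope
sketch in Python, not anything Nagios itself computes: the function name is
made up for illustration, and it assumes checks are spread evenly over the
interval and ignores host checks and scheduling overhead.

```python
import math

def required_concurrency(num_services, avg_check_seconds, target_interval_seconds):
    """Hypothetical helper: average number of checks running at once.

    Uses Little's law: items in flight = arrival rate * time per item.
    """
    checks_per_second = num_services / target_interval_seconds
    return math.ceil(checks_per_second * avg_check_seconds)

# Figures taken from the quoted message; everything else is an assumption.
services = 2500
avg_exec_seconds = 1.5

for minutes in (3, 4):
    needed = required_concurrency(services, avg_exec_seconds, minutes * 60)
    print(f"{minutes}-minute interval: about {needed} concurrent checks")
```

Under these assumptions the average load is roughly 16 to 21 simultaneous
checks, so a max_concurrent_checks value below that range would be expected
to stretch the effective check interval. Real check times vary, so some
headroom above the average is advisable.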
william(at)elan.net wrote:
> Thanks for the pointer. I had heard about DNX but have not looked at it
> closely yet; I'll take a look. However, I have a bad feeling that the
> company I'm setting it up for would not like it because it's listed as
> "alpha" software. Also, DNX is more for "distributed" monitoring, whereas
> there everything runs on the same box.
>
> I'm more interested in learning how nagios decides how many processes
> it can run based on system load so as to try to tune it to have more
> processes & service checks done simultaneously.
>
> On Tue, 27 Mar 2007 bobi at netshel.net wrote:
>
>> Have you checked out the Distributed Nagios eXecutive (DNX) at Source Forge?
>>
>> The purpose of this project was to increase service check capacity and
>> throughput by creating a multi-threaded and distributed service check
>> architecture around Nagios (it's based on Nagios 2.7)
>>
>> Bob
>>
>>> I have an issue with one of the client nagios installations where
>>> nagios is executing checks too rarely, and none of the tuning options
>>> I've tried helped. Currently they have 2500 services on about
>>> 120 hosts and nagios seems to execute checks about every 8-9 minutes,
>>> whereas what is needed is about every 3-4 minutes. I've tried manual
>>> tuning by setting 'service_inter_check_delay_method' (I set it to
>>> 0.05, which is even more aggressive than needed, but it did cause a
>>> slight improvement over 's') and 'service_interleave_factor' (I tried
>>> setting it to '1' and '2' but the results were worse). Now, as far as I
>>> can tell, the issue is not scheduling (which nagios does correctly
>>> within the range I want) but the service check execution time, which is
>>> on average 1.5 seconds, and nagios does not want to run more concurrent
>>> processes.
>>>
>>> Now the question I have is how best to deal with this and tune it, both
>>> using current config options and, if I'm pointed in the right
>>> direction, by looking at the source code to see if
>>> it can be improved in some way.
>>>
>>> On a related note, I was looking at the source code. Before that
>>> I always thought nagios was more of a multi-threaded application,
>>> but based on what I can see (utils.c) it does multi-process
>>> execution, creating a new process for each service check (my_system
>>> function). Is there any interest in improving it? What I'm
>>> particularly interested in is having several worker threads
>>> capable of executing embedded perl plugins without going
>>> through the creation of a new process every time.
>>>
>>> --
>>> William Leibzon
>>> Elan Networks
>>> william at elan.net
Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org