Nagios scheduling question
Ethan Galstad
nagios at nagios.org
Thu Mar 29 18:21:11 CEST 2007
One other note - make sure you are not running regularly scheduled checks
of hosts in your Nagios 2.x setup, as this can cause backlog problems with
scheduling. This has been remedied in the Nagios 3 code, but it's still
alpha...
william(at)elan.net wrote:
> On Wed, 28 Mar 2007, Ethan Galstad wrote:
>
>> Make sure the max_concurrent_checks is high enough. Running nagios with
>> the -s option should throw a warning if it isn't.
>
> It's set to 0, so it should be unlimited. This was the first thing I
> checked.
>
>> You can also compile the Nagios daemon with a debugging option for
>> printing information on scheduled tasks. Run the configure script like
>> so:
>>
>> ./configure --enable-DEBUG3
>>
>> Then run Nagios as a foreground process and pipe the output to a file
>> that you can examine for potential problem messages.
>
> I compiled a new binary for 2.8 and one with debug options. I have to
> go through a certain procedure of testing binaries outside the production
> environment before I can try it. The next time I'll be there to deal
> with it is Tuesday, so I'll know more then. I'm also trying to convince
> them to try 3.0, as it has a number of improvements for larger operations
> and no parallelism limitations for host checks.
>
>> william(at)elan.net wrote:
>>> Thanks for the pointer. I've heard about DNX but have not looked at it
>>> closely yet; I'll take a look. However, I have a bad feeling the company
>>> I'm setting it up for would not like it because it's listed as "alpha"
>>> software; also, DNX is more for "distributed" monitoring, whereas there
>>> everything runs on the same box.
>>>
>>> I'm more interested in learning how nagios decides how many processes
>>> it can run based on system load, so that I can try to tune it to run
>>> more processes and service checks simultaneously.
>>>
>>> On Tue, 27 Mar 2007 bobi at netshel.net wrote:
>>>
>>>> Have you checked out the Distributed Nagios eXecutive (DNX) at SourceForge?
>>>>
>>>> The purpose of this project was to increase service check capacity and
>>>> throughput by creating a multi-threaded and distributed service check
>>>> architecture around Nagios (it's based on Nagios 2.7)
>>>>
>>>> Bob
>>>>
>>>>> I have an issue with one of the client nagios installations where
>>>>> nagios is executing checks too rarely, and none of the tuning options
>>>>> I've tried have helped. Currently they have 2500 services on about
>>>>> 120 hosts, and nagios seems to execute checks about every 8-9 minutes,
>>>>> whereas what is needed is about every 3-4 minutes. I've tried manual
>>>>> tuning by setting 'service_inter_check_delay_method' (I set it to
>>>>> 0.05, which is even more aggressive than needed, but it did cause a
>>>>> slight improvement over 's') and 'service_interleave_factor' (I tried
>>>>> setting it to '1' and '2', but the results were worse). Now, as far as
>>>>> I can tell, the issue is not scheduling (which nagios does correctly,
>>>>> within the range I want) but the service check execution time, which
>>>>> is on average 1.5 seconds, and nagios does not want to run more
>>>>> concurrent processes.
>>>>>
>>>>> Now the question I have is how best to deal with this and tune it,
>>>>> both using the current config options and, if I'm pointed in the
>>>>> right direction, by looking at the source code to see if it can be
>>>>> improved in some way.
>>>>>
>>>>> On a related note, I was looking at the source code. Before, I
>>>>> always thought nagios was more of a multi-threaded application,
>>>>> but based on what I can see (utils.c) it does multi-process
>>>>> execution, creating a new process for each service check (the
>>>>> my_system function). Is there any interest in improving it? What
>>>>> I'm particularly interested in is having several worker threads
>>>>> capable of executing embedded perl plugins without going
>>>>> through the creation of a new process every time.
>>>>>
>>>>> --
>>>>> William Leibzon
>>>>> Elan Networks
>>>>> william at elan.net
>
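
The figures quoted above (2500 services, about 1.5 seconds per check, a
wanted cycle of 3-4 minutes against an observed 8-9 minutes) allow a rough
estimate of how many checks have to be in flight at once. The sketch below
is only a back-of-the-envelope calculation: it assumes checks are spread
evenly over the cycle and applies Little's law, it is not how Nagios itself
computes anything, and the function names are made up for illustration.

```python
# Back-of-the-envelope estimate using the figures quoted in the thread.
# Assumptions (not from Nagios): checks are spread evenly over the cycle
# and every check takes the average execution time.
SERVICES = 2500          # services being monitored (from the thread)
AVG_EXEC_SECONDS = 1.5   # average check execution time (from the thread)


def required_concurrency(services, exec_seconds, cycle_seconds):
    """Average number of checks in flight needed to complete one pass
    over all services every cycle_seconds (Little's law)."""
    checks_per_second = services / cycle_seconds
    return checks_per_second * exec_seconds


def even_spread_delay(services, cycle_seconds):
    """Seconds between consecutive check starts if the checks are
    spread evenly across the cycle."""
    return cycle_seconds / services


if __name__ == "__main__":
    for label, minutes in (("wanted", 3), ("wanted", 4), ("observed", 8.5)):
        cycle = minutes * 60
        in_flight = required_concurrency(SERVICES, AVG_EXEC_SECONDS, cycle)
        delay = even_spread_delay(SERVICES, cycle)
        print(f"{label:8s} {minutes:>4} min cycle: "
              f"~{in_flight:.1f} checks in flight, "
              f"{delay:.3f} s between check starts")
```

Under these assumptions a 3-minute cycle needs roughly 21 checks running at
any moment, while the observed 8-9 minute cycle corresponds to roughly 7.
An even spread over 3 minutes also works out to about 0.072 seconds between
check starts, so the 0.05 setting mentioned above is already tighter than
that, which fits the observation that scheduling is not the limiting factor.
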
Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org