Rampant Nagios

admin at jpk236.com admin at jpk236.com
Tue May 17 17:29:18 CEST 2005


Thank you for your very informative answer.  I'd like to know whether it 
would change given more detail about my setup:

2 Central Fail-Over servers
8 Distributed Monitoring Servers

I use NSCA to do the distributed monitoring.  The only servers on which 
I've seen the high CPU usage are the distributed monitoring servers, 
which have no children.  I would completely agree with your answer had 
the problem been with a central server, but the fact that it's with a 
distributed server still leaves me confused.

  - Justin Kulikowski
	[ http://www.jpk236.com ]

Andreas Ericsson wrote:
> admin at jpk236.com wrote:
> 
>> Nagios v2.03b3
>> FreeBSD 5.4-RELEASE-p1
>>
>> On some of the hosts I monitor I've been noticing some peculiarities.
>> Nagios will spontaneously become a CPU hog -- using upwards of 80-90%
>> CPU, sometimes higher.
>>
> 
> This is probably when a host with children has gone down. Nagios will 
> force a check of all hosts and services beyond and "inside" the down 
> host. This eats a lot of CPU, obviously.
> 
>> I try stopping nagios using FreeBSD's rc.d script for nagios.  The
>> output claims nagios has stopped, but when I run `ps auxwww` there is
>> still an instance of nagios running.  I can only assume the rc.d script
>> was able to remove the lock file, but was not able to stop the process.
>>
> 
> This is because one of the threads (service_result_worker_thread) sets 
> its cancel state to deferred and then goes off to do some 
> uninterruptible IO. It will die eventually, but not straight away 
> (unless it's just reached the pthread_mutex_unlock() at the end of its 
> base function).
> 
>> Has anyone else experienced this behavior?
>>
> 
> It is present in all Nagios versions since 2.01b. If it's still running 
> after 10 seconds, you've got a real runaway.
> 
>>  - Justin Kulikowski
>>     [ http://www.jpk236.com ]
>>
>>
>>
>> -------------------------------------------------------
>> This SF.Net email is sponsored by Oracle Space Sweepstakes
>> Want to be the first software developer in space?
>> Enter now for the Oracle Space Sweepstakes!
>> http://ads.osdn.com/?ad_id=7412&alloc_id=16344&op=click
>> _______________________________________________
>> Nagios-users mailing list
>> Nagios-users at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nagios-users
>> ::: Please include Nagios version, plugin version (-v) and OS when 
>> reporting any issue. ::: Messages without supporting info will risk 
>> being sent to /dev/null
>>
> 
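For what it's worth, the 10-second rule of thumb quoted above could be 
checked with something like the following.  This is only a rough sketch, 
assuming a POSIX system and that the Nagios PID was noted before running 
the rc.d stop (for example from `ps auxwww`).  The function names 
pid_alive and exited_within are made up for illustration and are not 
part of Nagios or the rc.d script.

```python
import os
import time


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def exited_within(pid, grace=10.0, poll=0.5, alive=pid_alive):
    """Poll until the process is gone or the grace period runs out.

    Returns True if the process exited in time, False if it is still
    running after `grace` seconds (the "real runaway" case).
    The `alive` parameter is a hypothetical hook so the check can be
    swapped out, e.g. for testing.
    """
    deadline = time.monotonic() + grace
    while alive(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

Calling exited_within(pid) right after the stop script returns would 
separate the normal delayed shutdown from a process that really needs a 
manual kill.  Note that a PID can be reused by an unrelated process, so 
this is a heuristic rather than a guarantee.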






