Rampant Nagios
admin at jpk236.com
Tue May 17 17:29:18 CEST 2005
Thank you for your very informative answer. I'd like to know whether it
would change given more detail about my setup:
2 Central Fail-Over servers
8 Distributed Monitoring Servers
I use NSCA to do the distributed monitoring. The only servers on which
I've seen the high CPU usage are the distributed monitoring servers,
which have no children. I would completely agree with your answer had
the problem been with a central server, but the fact that it's with a
distributed server still leaves me confused.
- Justin Kulikowski
[ http://www.jpk236.com ]
Andreas Ericsson wrote:
> admin at jpk236.com wrote:
>
>> Nagios v2.03b3
>> FreeBSD 5.4-RELEASE-p1
>>
>> On some of the hosts I monitor I've been noticing some peculiarities.
>> Nagios will spontaneously become a CPU hog -- using upwards of 80-90%
>> CPU, sometimes higher.
>>
>
> This is probably when a host with children has gone down. Nagios will
> force a check of all hosts and services beyond and "inside" the down
> host. This eats a lot of CPU, obviously.
>
>> I try stopping nagios using FreeBSD's rc.d script for nagios. The
>> output claims nagios has stopped, but when I run `ps auxwww` there is
>> still an instance of nagios running. I can only assume the rc.d script
>> was able to remove the lock file, but was not able to stop the process.
>>
>
> This is because one of the threads (service_result_worker_thread) sets
> its cancel state to deferred and then goes off to do some
> uninterruptible IO. It will die eventually, but not straight away
> (unless it's just reached the pthread_mutex_unlock() at the end of its
> base function).
>
>> Has anyone else experienced this behavior?
>>
>
> It is present in all Nagios versions since 2.01b. If it's still running
> after 10 seconds, you've got a real runaway.
>
>> - Justin Kulikowski
>> [ http://www.jpk236.com ]
>>
>>
>>
>> -------------------------------------------------------
>> This SF.Net email is sponsored by Oracle Space Sweepstakes
>> Want to be the first software developer in space?
>> Enter now for the Oracle Space Sweepstakes!
>> http://ads.osdn.com/?ad_id=7412&alloc_id=16344&op=click
>> _______________________________________________
>> Nagios-users mailing list
>> Nagios-users at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nagios-users
>> ::: Please include Nagios version, plugin version (-v) and OS when
>> reporting any issue. ::: Messages without supporting info will risk
>> being sent to /dev/null
>>
>
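
The 10-second rule of thumb from the quoted reply can be checked
mechanically after the rc.d script reports that Nagios has stopped. What
follows is only a rough sketch, not anything shipped with Nagios: the
function names process_exists and wait_for_exit, the 10-second default
and the 0.5-second polling interval are assumptions made for
illustration, and the PID is expected to come from your own `ps auxwww`
output.

```python
import os
import time


def process_exists(pid):
    """Return True if a process with this PID currently exists."""
    try:
        # Signal 0 delivers nothing; it only performs the existence
        # and permission checks.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def wait_for_exit(pid, grace=10.0, poll=0.5):
    """Poll until the process exits or the grace period elapses.

    Returns True if the process went away within `grace` seconds,
    False if it is still present (a likely runaway).
    """
    deadline = time.monotonic() + grace
    while process_exists(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

Calling wait_for_exit(pid) right after the stop script returns would
tell the slow-but-dying worker thread apart from a real runaway: True
means the process went away within the grace period, False means it is
still there. Note that a child process that has exited but has not yet
been reaped by its parent still counts as existing for this check.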