Performance issues, too
Andreas Ericsson
ae at op5.se
Wed Jan 3 11:53:58 CET 2007
Robert Hajime Lanning wrote:
> I have also been having performance issues with Nagios 2.5 on
> a Sun E220R with two 400MHz procs and 1GB ram.
>
> Sys stats are at http://lanning.cc/kipper.html
>
> The large dips in load and system CPU time are when I restart
> Nagios. (cron'd twice a week, but I have also been making
> a lot of service updates lately, hence the almost once a day
> restarts.) For the restarts to fix the latency, I have
> "use_retained_scheduling_info=0".
>
> After about three days the Service Check latency will grow
> to over 300 seconds. It is usually steady at around 0-5
> seconds, for a couple of days, then it will rise over the
> course of a few hours to over the 300 second mark.
>
This is a bit bizarre and simply must be related to something else. Does
Nagios run out of command buffer slots? Are they not being freed properly?
>
> I have noticed that Nagios seems to have a memory leak, as
> I have watched the process grow from 124M to 126M over the
> last hour.
>
This can probably be attributed to the fact that Nagios fork()s, then
frees and allocates memory in the child before calling execve(), all in
a threaded process. This isn't prohibited per se, but it is strongly
discouraged. I wouldn't be surprised to find that other applications
that do the same thing leak memory on Solaris too. On Linux, threads
are created in a 1-1 fashion (meaning each thread is backed by its own
kernel-scheduled task, much like a process). This holds true for some
other systems as well, and afaik there are 1-1 thread implementations
for Solaris as well. In any case, the 1-1 thing means that the kernel
cleans up any left-over memory for those tasks when they exit, which
isn't necessarily the case in a 1-many relationship thread
implementation. Possibly worth investigating.
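
For what it's worth, the pattern usually recommended is to prepare
everything before fork() and have the child do nothing but execve() or
_exit(). A minimal sketch in C follows; run_check() is a hypothetical
helper I made up for illustration and is not taken from the Nagios
source:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not Nagios code.
 * argv and envp must be fully built by the caller before this is called.
 * Returns the child's exit status, or -1 if fork()/waitpid() failed or
 * the child was killed by a signal. */
int run_check(char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no malloc()/free() here, only async-signal-safe calls. */
        execve(argv[0], argv, envp);
        _exit(127);            /* reached only if execve() failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Since the child touches no heap between fork() and execve(), it doesn't
matter what state the allocator was in when another thread got cut off
by the fork.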
> I use ePN with caching. Most of my checks are SNMP requests
> via ePN scripts (http://lanning.cc/custom_plugins/), with
> p1.pl modified with:
>
> use SNMP 5.0;
> SNMP::loadModules("ALL");
>
Forgive a novice, but doesn't this make it load all SNMP MIB modules each
time it runs a Perl plugin? That would certainly have a major impact on
load and could well lead to memory leaks (assuming the modules aren't
always freed after having been loaded).
> We have put into our budget to move Nagios to a Linux/Intel
> server. But, what bugs me is the high CPU time in kernel
> space, because of Nagios.
>
Again, this is a behaviour not regularly experienced on Linux (which is
the base for most Nagios installations). Linux is simply very, very good
at fork(). It doesn't even bother trying to do other things properly
(like 1-many threading), simply because it's so damn good at forking. It
would be interesting to see if your problems go away when you move to
Linux. I'm not saying it's superior to Solaris, but afaiu, Ethan runs
all his tests on Linux and would certainly have found bugs of this kind
if they had bitten him.
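
If you want a number to compare between the E220R and the new box, a
rough fork() microbenchmark could look something like this. It's only a
sketch: time_forks() is a name I made up, and it measures the bare
fork()+waitpid() cost, nothing Nagios-specific.

```c
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork n children that exit immediately and
 * return the average wall-clock seconds per fork()+waitpid() pair,
 * or -1.0 on failure. */
double time_forks(int n)
{
    struct timeval start, end;
    int i;

    if (n <= 0)
        return -1.0;

    gettimeofday(&start, NULL);
    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1.0;
        if (pid == 0)
            _exit(0);                  /* child: exit at once */
        if (waitpid(pid, NULL, 0) < 0)
            return -1.0;
    }
    gettimeofday(&end, NULL);

    return ((end.tv_sec - start.tv_sec) +
            (end.tv_usec - start.tv_usec) / 1e6) / n;
}
```

Calling time_forks(1000) from a small main() on both machines gives a
rough comparison. A process with a large address space forks more slowly
than a tiny test program, so treat the figure as a lower bound.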
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null