server profiling options

Daniel Wittenberg daniel.wittenberg.r0ko at statefarm.com
Fri Nov 19 20:55:48 CET 2010


I'm a bit suspicious about the external_command_check_interval
directive (see http://nagios.sourceforge.net/docs/3_0/configmain.html).

If it's set to "-1" (as mine was until recently) then  Nagios will
check external commands as often as possible.  I suspect it helps if
you set it to a definite interval, for example 15s, but check
nagiostats to make sure your command buffers don't fill up.


- This is something I was going to try as well.
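
For the nagiostats suggestion above, a rough Python sketch of the buffer check might look like the following. It assumes nagiostats is on the PATH and that its MRTG mode prints one value per line for the USEDCMDBUF, HIGHCMDBUF and TOTCMDBUF variables. The helper names and the 80% threshold are made up for illustration.

```python
import subprocess

# MRTG variable names as listed in the Nagios 3.x nagiostats help output.
BUFFER_VARS = ["USEDCMDBUF", "HIGHCMDBUF", "TOTCMDBUF"]


def parse_buffer_stats(mrtg_output):
    """Map one-value-per-line MRTG output onto the requested variable names."""
    values = [int(field) for field in mrtg_output.split()[:len(BUFFER_VARS)]]
    if len(values) != len(BUFFER_VARS):
        raise ValueError("expected %d values, got %d" % (len(BUFFER_VARS), len(values)))
    return dict(zip(BUFFER_VARS, values))


def buffers_nearly_full(stats, threshold=0.8):
    """True when the high-water mark has reached `threshold` of the total slots.

    The 0.8 default is an arbitrary illustration, not a Nagios recommendation.
    """
    total = stats["TOTCMDBUF"]
    return total > 0 and stats["HIGHCMDBUF"] / total >= threshold


def read_buffer_stats(nagiostats="nagiostats"):
    """Run nagiostats (binary location is an assumption) and parse the slot counts."""
    out = subprocess.run(
        [nagiostats, "--mrtg", "--data=" + ",".join(BUFFER_VARS)],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_buffer_stats(out)
```

If buffers_nearly_full(read_buffer_stats()) comes back true after lengthening the interval, that would suggest the interval is too long for the command volume.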


IME Nagios itself is usually quite light on CPU.  It's the plugins and
how frequently they run which affect performance the most.   I always
set check_interval and retry_interval as long as possible in service
definitions to spread the load as much as possible.

Some plugins can be real performance hogs too, especially
check_esx3.pl if you use that (I don't mean to dis' it, as it's a
super plugin - it just eats CPU).  Run 'top' and you will probably see
which plugins are the biggest hogs on your system.

- Right now the plugins aren't causing much of an issue; it's the core
Nagios engine causing issues...
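
One way to put numbers on the plugins-versus-core question is to total CPU per command name instead of eyeballing 'top'. The sketch below is only an illustration: it assumes the text comes from something like `ps -eo pcpu,comm` on Linux (header row first, %CPU then command), and the function name is hypothetical. Note that ps reports CPU averaged over each process's lifetime, so short-lived plugins may be under-represented.

```python
from collections import defaultdict


def cpu_by_command(ps_output):
    """Sum %CPU per command name from `ps -eo pcpu,comm` style output."""
    totals = defaultdict(float)
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        pcpu, command = parts
        totals[command.strip()] += float(pcpu)
    # Biggest consumers first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

If the nagios process itself sits at the top of the returned list, that backs up the impression that the core engine rather than the plugins is the bottleneck.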

ndo (the interface with MySQL if you have that installed) can be a
real performance hog.  That's a whole other topic!

- Luckily no, not right now...

If you're using pnp4nagios for graphing performance data, consider
setting it up in bulk mode, ideally on a separate server.  It won't
make a huge difference but might help a bit.

- Already done

If it's more important to you to stop Nagios hammering your server
than it is for Nagios to work right, you can use max_concurrent_checks
to limit the number of checks Nagios can run at any time.  Keep an eye
on your service check latency if you do that though - if latency gets
too high (more than a minute or so) you will find Nagios' usefulness
diminish quite rapidly!  Personally I think you should give Nagios a
dedicated server and let it use as much CPU as it needs.
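
The latency rule of thumb above can be written down as a small check. This is a sketch under assumptions: the input is an average active service check latency in milliseconds (the unit nagiostats uses for its AVGACTSVCLAT variable), the 60-second limit mirrors the "more than a minute or so" figure, and the function names are hypothetical.

```python
def latency_seconds(avg_latency_ms):
    """Convert a latency reported in milliseconds to seconds."""
    return avg_latency_ms / 1000.0


def latency_too_high(avg_latency_ms, limit_seconds=60.0):
    """True when average check latency exceeds the chosen limit."""
    return latency_seconds(avg_latency_ms) > limit_seconds
```

Comparing the value before and after lowering max_concurrent_checks shows whether the limit is starving the scheduler.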

Oh, and v3.1.3 includes a fix which improves performance of the status
cgis.  I'm looking forward to trying that myself next week.

Ah, yes, if you have quite a few users, consider setting
"refresh_rate" in cgi.cfg to a longer time, otherwise everyone who
leaves a status screen open in their browser will hit your Nagios
server every 90 seconds (or whatever value it's set to on your
system).  If I recall correctly, I set mine to 180.

- Luckily we have almost no users on this; we've been using multisite
since we have a number of servers.
I'm not sure if any of this will help you, but hopefully it will give
you an idea or two.

Cheers,

Jim

- Yeah, any other thoughts people have are helpful.  I also tried the
new options:
auto_reschedule_checks=1
auto_rescheduling_interval=30
auto_rescheduling_window=180

this morning, and with the defaults it seems to actually be helping...
One other thing I'll probably do is strip out some macros to reduce
processing overhead for things that are static and don't need to be
checked every time, like the path to plugins and such.

Dan

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users