Nagios heavy load check
Stanley Hopcroft
Stanley.Hopcroft at IPAustralia.Gov.AU
Wed Apr 30 07:22:11 CEST 2003
Dear Sir,
I am writing to thank you for your letter and say,
On Tue, Apr 29, 2003 at 12:56:41PM -0400, Patrick LeBoutillier wrote:
> Hi all,
>
> Is it possible for Nagios to send notifications when it starts getting too
> far behind in its checks? We had a previous monitoring tool that had that
> problem. It would get overwhelmed with tests (especially when many timeouts
> occurred).
>
> I guess I could use a cron job to check the scheduling queue CGI, but maybe
> there is a better way (maybe when max_concurrent_checks=20 is reached too
> often or something like that).
If no one has suggested this already, one option is a cron-scheduled check
of the Nagios performance page that looks at the average (and perhaps
the maximum) check latency:
http://<Your_Nag>/nagios/cgi-bin/extinfo.cgi?&type=4
You will have to deal with the HTML tables (a Perl check could use
HTML::Table or roll your own with HTML::Parser), but this could be
done simple-mindedly by searching the HTML for 'Check Latency:'.
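As a rough illustration of the simple-minded approach, here is a minimal Python sketch. It assumes that the text following 'Check Latency:' on the page contains up to three figures written like "0.297 sec" (I have not confirmed the exact layout, so check it against your own page). The host name, the thresholds and the function names are all hypothetical, and HTTP authentication is not handled.

```python
import re
import urllib.request

# Hypothetical URL: substitute your own Nagios host.
PERF_URL = "http://nagios.example.com/nagios/cgi-bin/extinfo.cgi?&type=4"


def latency_values(html, count=3):
    """Return up to `count` latency figures (seconds) after 'Check Latency:'.

    Assumption: the figures appear as '<number> sec' after the label.
    """
    text = re.sub(r"<[^>]+>", " ", html)           # crude tag stripping
    _, found, rest = text.partition("Check Latency:")
    if not found:
        return []
    numbers = re.findall(r"(\d+(?:\.\d+)?)\s*sec", rest)
    return [float(n) for n in numbers[:count]]


def check(html, warn=30.0, crit=60.0):
    """Map the largest latency figure to a plugin-style (exit code, message)."""
    values = latency_values(html)
    if not values:
        return 3, "UNKNOWN: 'Check Latency:' not found"
    worst = max(values)                             # thresholds are illustrative
    if worst >= crit:
        return 2, f"CRITICAL: check latency {worst} sec"
    if worst >= warn:
        return 1, f"WARNING: check latency {worst} sec"
    return 0, f"OK: check latency {worst} sec"


def main(url=PERF_URL):
    """Fetch the performance page and print a one-line status."""
    with urllib.request.urlopen(url, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    code, message = check(html)
    print(message)
    return code
```

From cron, one would call something like sys.exit(main()) and mail the output when the exit code is non-zero. Thresholding the largest figure avoids depending on the column order, at the cost of alerting on a single slow check.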
>
> Thanks,
>
However, when Nag is tuned on appropriately sized hardware, I have found
it performs well and reliably.
The only cases of check latency I have seen (or heard about) involve
. lots of hosts down (e.g. power, LAN or shared storage failure)
. very many services (>= 1000)
. high check frequencies
. lossy network connections or flapping services necessitating check
  retries.
At my employer's site, Nag is checking 350 services at 5 minute
intervals, a target rate of 70 checks/minute. If this were higher
(say hundreds per minute, because the check interval is small, the
checks take more than 5 minutes to complete, or there are more
services to be checked), then this may be a cause of latency.
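The arithmetic behind the target rate can be written down in a few lines of Python. The second function is my own addition rather than anything Nagios reports: it applies Little's law (average number in flight = arrival rate x average duration) to estimate how many checks run at once, and the 10 second average duration used below is purely hypothetical.

```python
def checks_per_minute(services, interval_minutes):
    """Target check rate when every service is checked once per interval."""
    return services / interval_minutes


def concurrent_checks(rate_per_minute, avg_duration_seconds):
    """Average number of checks in flight, by Little's law."""
    return rate_per_minute / 60.0 * avg_duration_seconds


rate = checks_per_minute(350, 5)        # the figures quoted above
in_flight = concurrent_checks(rate, 10) # 10 sec is an assumed average duration
print(f"{rate:.0f} checks/minute, about {in_flight:.1f} running at once")
```

Comparing the second figure with a limit such as max_concurrent_checks gives a rough idea of how much headroom there is before checks start queueing.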
You should be able to simulate at least part of the load of checking
with a fake check that waits as long as your longest check and a
simple driver that forks your target number of checks and execs the
fake check (Nag does more than this of course, but this will give you
a lower bound on your resource budget).
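A minimal sketch of such a driver, in Python, might look like the following. The fake check is just a child Python process that sleeps. The function name and the example figures are hypothetical, and this only exercises process creation and waiting, not anything Nagios does with the results.

```python
import subprocess
import sys
import time


def run_fake_checks(count, seconds):
    """Start `count` fake checks at once, each sleeping `seconds`.

    Returns (elapsed wall-clock seconds, list of child exit codes).
    """
    # The fake check: a separate process that does nothing but wait.
    fake_check = [sys.executable, "-c", f"import time; time.sleep({seconds})"]
    start = time.monotonic()
    children = [subprocess.Popen(fake_check) for _ in range(count)]
    codes = [child.wait() for child in children]
    return time.monotonic() - start, codes
```

For example, run_fake_checks(70, 30) would start 70 checks that each wait 30 seconds. Watching memory, load average and the elapsed time while it runs shows whether the machine copes with that many simultaneous processes. If the elapsed time is much longer than the sleep time, process creation alone is already a bottleneck.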
Yours sincerely.
--
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------
'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'
from Meditation 17, J Donne.