Nagios-devel digest, Vol 1 #1058 - 2 msgs
Andreas Ericsson
ae at op5.se
Fri Apr 21 12:20:35 CEST 2006
Please refrain from top-posting. It makes following the discussion a lot
harder.
Vegard Hanssen wrote:
> There is a problem with defining check-host-alive as check_nrpe.
>
> A normal problem in my setup:
>
> One host gets too high a load. This times out the nrpe checks, giving
> me 5-10 sms for the timeout, and then (2 minutes later) the same sms for
> OK. The host isn't down, it's just in a very busy state, and I know this
> since I don't get a host down message. If I change the check-host-alive
> to nrpe I will then get a host down message, which can mean anything
> from high load to the host being unreachable or actually down. Host down
> = I have to drop everything and get to work, High Load = let's give it a
> minute to cool down first.
>
> I could do the same as Øysten Bleie suggests, but I'm not sure that's
> good either. Actually I'm not sure what's best to do, so for the
> moment I've stuck with all the sms.
>
You could bite the bullet and set up the service dependencies. You want
Nagios to automagically understand that there's a relationship between
the nrpe-based services and the nrpe service itself, but since Nagios
has no understanding of what checks you're running, that is quite
complex to implement. Nagios lets you do the part requiring accuracy
(namely the "what needs what" part) and then handles the rest for you.
With some clever scripting and a generic naming convention I'm sure
you'll be able to set everything up in half an hour, saving you quite a
few notifications on failures.
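
As a rough illustration of the scripting I have in mind, here is a
minimal Python sketch. It assumes a naming convention that is purely
hypothetical: each host has a master service called "nrpe", and every
service that runs through nrpe is named "nrpe-<something>". The
inventory dict, the function names and the chosen failure criteria are
all assumptions too; adjust them to your own setup.

```python
# Sketch: emit one Nagios servicedependency per NRPE-based service, so
# that its notifications are suppressed while the master NRPE service
# on the same host is in a non-OK state.
# Assumed (hypothetical) convention: the master service is named "nrpe"
# and every dependent service name starts with "nrpe-".

MASTER = "nrpe"
PREFIX = "nrpe-"


def dependency_block(host, dependent):
    """Return one servicedependency definition as a string."""
    directives = [
        # The service that is depended upon (the master).
        ("host_name", host),
        ("service_description", MASTER),
        # The service that depends on it.
        ("dependent_host_name", host),
        ("dependent_service_description", dependent),
        # Assumed policy: keep running the check, but send no
        # notifications while the master is WARNING/UNKNOWN/CRITICAL.
        ("execution_failure_criteria", "n"),
        ("notification_failure_criteria", "w,u,c"),
    ]
    body = "\n".join(f"    {key:<32}{value}" for key, value in directives)
    return "define servicedependency {\n" + body + "\n}"


def generate(services_by_host):
    """Build definitions for every service matching the naming convention."""
    blocks = []
    for host in sorted(services_by_host):
        for service in sorted(services_by_host[host]):
            if service.startswith(PREFIX):
                blocks.append(dependency_block(host, service))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical inventory; in practice you would build this from
    # your existing service definitions.
    inventory = {
        "web1": ["nrpe", "nrpe-load", "nrpe-disk", "ping"],
        "db1": ["nrpe", "nrpe-load"],
    }
    print(generate(inventory))
```

Redirect the output to a file that your main config includes, and
re-run the script whenever you add hosts or services. Services that
don't follow the convention ("ping" above) are left alone.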
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231