NRPE 2.0 + static nat = trouble
Andreas Ericsson
ae at op5.se
Thu Jun 17 13:25:32 CEST 2004
Frederic Vanden Poel wrote:
> We are using nrpe 2.0 on all our Linux servers.
>
> Towards the internal machines, nrpe runs fine and we never get bogus
> alerts.
>
> On the DMZ machines, nrpe daemons started from xinetd sometimes remain
> in an ESTABLISHED state (netstat) and strace shows nrpe is waiting
> during a read() syscall. The nrpe setup in xinetd is pretty standard:
>
What system are you running (Linux, BSD, Solaris, ...)?
> service nrpe
> {
> disable = no
> flags = REUSE
> socket_type = stream
> wait = no
> user = nagios
> server = /usr/local/nagios/nrpe
> server_args = -i -c /etc/nrpe.cfg
> only_from = 127.0.0.1 aaa.bbb.ccc.ddd
> log_on_failure += USERID
> }
>
> After a couple of hours there are so many nrpe processes in this state
> that new connections from the nagios server result in the following
> error:
>
> [06-08-2004 16:14:49] SERVICE ALERT: svr.dmz.com;/var
> disk;CRITICAL;HARD;4;CHECK_NRPE: Error - Could not complete SSL
> handshake.
>
> which is quite annoying as we are notified with bogus alerts for the
> services checked through nrpe.
>
> The nagios server has a one-to-one static NAT address aaa.bbb.ccc.ddd in
> the DMZ range, which means that the other DMZ machines see the nagios
> probes coming from within the DMZ.
>
> We also tried to run nrpe as a standalone daemon without xinetd and the
> problem remains. After a while, the nrpe daemon just gives up without a
> syslog message. The nrpe problem occurs on all our Linux DMZ servers.
>
> We have tried to change connection timeouts on the firewall and through
> nagios but none of them seem to help. We have a less than ideal
> workaround by killing all hanging nrpe processes from cron.
>
> Any ideas to debug the issue are very welcome.
>
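To put a number on the stuck connections before and after any change, it may help to count the ESTABLISHED sockets on the NRPE port. Below is a rough Python sketch, not a tested tool. It assumes Linux `netstat -tn` column layout and NRPE's default port 5666 (your xinetd setup may use another); the function names are made up for illustration.

```python
import subprocess

NRPE_PORT = 5666  # NRPE's default port; an assumption, adjust to your setup


def count_established(netstat_output, port=NRPE_PORT):
    """Count ESTABLISHED TCP connections whose local port equals `port`."""
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Linux `netstat -tn` rows: Proto Recv-Q Send-Q Local Foreign State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skip headers and non-TCP rows
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port == str(port) and fields[5] == "ESTABLISHED":
            count += 1
    return count


def run_netstat():
    """Return the text output of `netstat -tn` (hypothetical helper)."""
    result = subprocess.run(
        ["netstat", "-tn"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `count_established(run_netstat())` from cron every few minutes and logging the result should show how quickly the hanging processes pile up.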
You can set debug=1 in your nrpe.cfg file, and if you're really serious
about it you can enable debugging at compile time.
Do you think it would be possible to run nrpe with debug=1 for a while
and then send me the relevant syslog entries?
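If the syslog is busy, something like the following Python sketch could pull out just the nrpe entries. It assumes the lines carry a tag of the form `nrpe[PID]:` or `nrpe:`; the exact tag and the location of the log file depend on your syslog configuration, and the function name is only illustrative.

```python
import re

# Matches a syslog tag such as "nrpe[1234]:" or "nrpe:" (assumed format).
NRPE_TAG = re.compile(r"\bnrpe(\[\d+\])?:")


def nrpe_lines(lines):
    """Return only the lines that carry an nrpe syslog tag."""
    return [line for line in lines if NRPE_TAG.search(line)]
```

Passing an open log file to `nrpe_lines()` returns the matching lines unchanged, ready to be written out and mailed.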
--
Sourcerer / Andreas Ericsson
NRPE Maintainer
OP5 AB
+46 (0)733 709032
andreas.ericsson at op5.se