NRPE way too fragile ?
Guillaume Rousse
Guillaume.Rousse at inria.fr
Wed Oct 8 12:43:39 CEST 2008
Hello list.
I'm using nrpe quite heavily to test lots of local services on all my
machines. It usually works well, but seems a bit unreliable: too
often, nrpe itself fails to accept incoming connections, and the test fails:
CHECK_NRPE: Socket timeout after 10 seconds.
Running strace on the nrpe process shows that it is probably itself waiting
on another connection:
[root@denfert ~]# strace -p 22444
Process 22444 attached - interrupt to quit
select(6, [5], NULL, [5], {0, 170000}) = 0 (Timeout)
accept(5, 0, NULL) = -1 EAGAIN (Resource temporarily unavailable)
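For readers less familiar with this kind of trace, here is a minimal Python sketch of the same select-then-accept pattern. It is not NRPE's code: the `poll_once` helper and the loopback listener are made up for illustration. It only shows that EAGAIN is what a non-blocking listening socket returns from accept when no connection is pending, so the two lines above may simply be an idle poll loop rather than proof that the daemon is stuck.

```python
import errno
import select
import socket


def poll_once(listener, timeout):
    """Run one select-then-accept iteration on a non-blocking listener.

    Returns (ready, result): `ready` says whether select reported the
    socket readable, and `result` is either the accepted connection or
    the errno of the failed accept.
    """
    # Same shape as the traced call: the listener is in both the read
    # set and the exception set, with a short timeout.
    readable, _, _ = select.select([listener], [], [listener], timeout)
    try:
        # accept is attempted even if select timed out, as in the trace.
        conn, _addr = listener.accept()
    except BlockingIOError as exc:
        return bool(readable), exc.errno
    return bool(readable), conn


# Hypothetical stand-in for the daemon's socket: loopback, ephemeral port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
listener.setblocking(False)

ready, result = poll_once(listener, 0.17)
if isinstance(result, int):
    print("select ready:", ready, "accept failed with", errno.errorcode[result])
listener.close()
```

With no client connecting, select times out and accept fails with EAGAIN (or EWOULDBLOCK, which has the same value on Linux). Telling an idle loop from a blocked daemon would need a longer trace taken while a check is actually timing out.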
It usually recovers by itself, but that's enough to cause many
unwanted notifications, even though all monitored services have nrpe itself
as a dependency. I'm using SSL encryption, as usually advised, but I'm
planning to switch to plain-text connections (everything happens on a
distinct VLAN, without user access).
Does anyone else have a similar experience?
--
Guillaume Rousse
Moyens Informatiques - INRIA Futurs
Tel: 01 69 35 69 62