ndo2db problems on solaris 10 (ndoutils 1.4b7)
Ton Voon
ton.voon at altinity.com
Wed Feb 27 15:10:40 CET 2008
On 27 Feb 2008, at 13:18, Michael Prochaska wrote:
> truss of ndo2db (the -f option follows all children created by fork()
> or vfork()):
> root at nagios_1 # truss -f -p 6405
> 6405: accept(5, 0xFFBFF554, 0xFFBFF564, SOV_DEFAULT) (sleeping...)
> 6405: accept(5, 0xFFBFF554, 0xFFBFF564, SOV_DEFAULT) = 6
> 6405: schedctl() = 0xFECA8000
> 6405: fork1() = 6419
[snipped]
>
> 6405: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF
> [0x0000FFFF]
> 6419: fork1() (returning as child ...) = 6405
> 6419: getpid() = 6419 [6405]
> 6405: close(6) = 0
> 6419: lwp_self() = 1
> 6419: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF
> [0x0000FFFF]
> 6419: llseek(3, 0, SEEK_CUR) = 0
> 6419: close(3) = 0
> 6419: open("/usr/local/nagios/var/ndo2db.debug",
> O_RDWR|O_APPEND|O_CREAT, 0666) = 3
> 6419: sigaction(SIGQUIT, 0xFFBFED80, 0xFFBFEE20) = 0
> 6419: sigaction(SIGTERM, 0xFFBFED80, 0xFFBFEE20) = 0
> 6419: sigaction(SIGINT, 0xFFBFED80, 0xFFBFEE20) = 0
> 6419: sigaction(SIGSEGV, 0xFFBFED80, 0xFFBFEE20) = 0
> 6419: sigaction(SIGFPE, 0xFFBFED80, 0xFFBFEE20) = 0
> 6419: open("/etc/netconfig", O_RDONLY|O_LARGEFILE) = 7
> 6419: fcntl(7, F_DUPFD, 0x00000100) Err#22 EINVAL
> 6419: read(7, " # p r a g m a i d e n".., 1024) = 1024
> 6419: read(7, " t s t p i _ c".., 1024) = 215
> 6419: read(7, 0x000400E0, 1024) = 0
> 6419: lseek(7, 0, SEEK_SET) = 0
> 6419: read(7, " # p r a g m a i d e n".., 1024) = 1024
> 6419: read(7, " t s t p i _ c".., 1024) = 215
> 6419: read(7, 0x000400E0, 1024) = 0
> 6419: close(7) = 0
> 6419: open("/dev/udp", O_RDONLY) = 7
> 6419: ioctl(7, SIOCGLIFNUM, 0xFFBFEBD4) = 0
> 6419: close(7) = 0
> 6419: getuid() = 100 [100]
> 6419: getuid() = 100 [100]
> 6419: door_info(4, 0xFFBFE8E0) = 0
> 6419: door_call(4, 0xFFBFE988) = 0
> 6419: sigaction(SIGPIPE, 0xFFBFEC40, 0xFFBFECE0) = 0
> 6419: so_socket(PF_INET, SOCK_STREAM, IPPROTO_IP, "", SOV_DEFAULT)
> = 7
> 6419: brk(0x00041AF8) = 0
> 6419: brk(0x00045AF8) = 0
> 6419: fcntl(7, F_SETFL, (no flags)) = 0
> 6419: fcntl(7, F_GETFL) = 2
> 6419: connect(7, 0xFFBFED20, 16, SOV_DEFAULT) = 0
> 6419: setsockopt(7, SOL_SOCKET, SO_RCVTIMEO, 0xFFBFE1B8, 8,
> SOV_DEFAULT)
> Err#99 ENOPROTOOPT
> 6419: setsockopt(7, SOL_SOCKET, SO_SNDTIMEO, 0xFFBFE1B8, 8,
> SOV_DEFAULT)
> Err#99 ENOPROTOOPT
> 6419: brk(0x00045AF8) = 0
> 6419: brk(0x00047AF8) = 0
> 6419: setsockopt(7, ip, 3, 0xFFBFE29C, 4, SOV_DEFAULT) = 0
> 6419: setsockopt(7, tcp, TCP_NODELAY, 0xFFBFE298, 4, SOV_DEFAULT)
> = 0
> 6419: setsockopt(7, SOL_SOCKET, SO_KEEPALIVE, 0xFFBFE30C, 4,
> SOV_DEFAULT) = 0
> 6419: read(7, " 4\0\0\0\n 5 . 0 . 5 1\0".., 16384) = 56
> 6419: brk(0x00047AF8) = 0
> 6419: brk(0x00049AF8) = 0
> 6419: brk(0x00049AF8) = 0
> 6419: brk(0x0004BAF8) = 0
> 6419: stat64("/usr/local/mysql/share/mysql/charsets/Index.xml",
> 0xFFBFDB08) = 0
> 6419: brk(0x0004BAF8) = 0
> 6419: brk(0x0004FAF8) = 0
> 6419: open64("/usr/local/mysql/share/mysql/charsets/Index.xml",
> O_RDONLY) = 8
> 6419: read(8, " < ? x m l v e r s i o".., 18173) = 18173
> 6419: close(8) = 0
> 6419: brk(0x0004FAF8) = 0
> 6419: brk(0x00051AF8) = 0
> 6419: brk(0x00051AF8) = 0
> 6419: brk(0x00053AF8) = 0
> 6419: write(7, " C\0\001\rA2\0\0\0\0\0 @".., 71) = 71
> 6419: read(7, " W\0\002FF1504 # 2 8 0 0".., 16384) = 91
> 6419: shutdown(7, SHUT_RDWR, SOV_DEFAULT) = 0
> 6419: close(7) = 0
> 6419: getpid() = 6419 [6405]
> 6419: open("/proc/6419/psinfo", O_RDONLY) = 7
> 6419: read(7, "02\0\0\0\0\0\001\0\01913".., 336) = 336
> 6419: close(7) = 0
> 6419: fstat(-1, 0xFFBFE140) Err#9 EBADF
> 6419: open("/dev/conslog", O_WRONLY) = 7
> 6419: fcntl(7, F_SETFD, 0x00000001) = 0
> 6419: fstat(7, 0xFFBFE140) = 0
> 6419: fstat(7, 0xFFBFEBA0) = 0
> 6419: time() = 1204118219
> 6419: open("/usr/share/lib/zoneinfo/Europe/Vienna", O_RDONLY) = 8
> 6419: fstat64(8, 0xFFBFDFD0) = 0
> 6419: read(8, " T Z i f\0\0\0\0\0\0\0\0".., 801) = 801
> 6419: close(8) = 0
> 6419: getpid() = 6419 [6405]
> 6419: putmsg(7, 0xFFBFE258, 0xFFBFE24C, 0) = 0
> 6419: open("/var/run/syslog_door", O_RDONLY) = 8
> 6419: door_info(8, 0xFFBFE190) = 0
> 6419: getpid() = 6419 [6405]
> 6419: door_call(8, 0xFFBFE178) = 0
> 6419: close(8) = 0
> 6419: read(6, "\n\n H E L L O\n P R O T".., 511) = 511
> 6419: Incurred fault #6, FLTBOUNDS %pc = 0xFF20738C
> 6419: siginfo: SIGSEGV SEGV_MAPERR addr=0x44415441
> 6419: Received signal #11, SIGSEGV [caught]
> 6419: siginfo: SIGSEGV SEGV_MAPERR addr=0x44415441
> 6419: schedctl() = 0xFEC9E000
> 6419: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF
> [0x0000FFFF]
> 6419: _exit(0)
> 6405: Received signal #18, SIGCLD, in accept() [caught]
> 6405: siginfo: SIGCLD CLD_EXITED pid=6419 status=0x0000
> 6405: accept(5, 0xFFBFF554, 0xFFBFF564, SOV_DEFAULT) Err#4 EINTR
> 6405: lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF
> [0x0000FFFF]
> 6405: waitid(P_ALL, 0, 0xFFBFE968, WEXITED|WTRAPPED|WNOHANG) = 0
> 6405: setcontext(0xFFBFE8E8)
> 6405: write(2, " A c c e p t e r r o r", 12) = 12
> 6405: write(2, " : ", 2) = 2
> 6405: write(2, " I n t e r r u p t e d ".., 23) = 23
> 6405: write(2, "\n", 1) = 1
> 6405: shutdown(5, SHUT_RDWR, SOV_DEFAULT) Err#134
> ENOTCONN
> 6405: close(5) = 0
> 6405: _exit(1)
>
> Is this a general bug, or does anybody have ndoutils running on Solaris?
Funny you should mention this, as we have just found a fix for ndoutils
1.4b3 on Solaris. Note that the accept() call 11 lines up from the
bottom of your trace fails with EINTR. We've patched the code around
accept() so that an EINTR causes a retry, and this appears to work
around the problem. See the attached patch. My guess is that the signal
arrives at the same moment the parent gets a result from accept(), so
accept() returns this error rather than the child signal being handled
first.
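
For anyone who cannot fetch the attachment, the general idea of retrying on
EINTR can be sketched as follows. This is only an illustration of the
technique, not the contents of the patch: the helper name
accept_retry_eintr() is made up for this example and does not appear in
ndoutils.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not from ndoutils or the attached patch):
 * call accept() again whenever it is interrupted by a signal.
 * Any other failure is returned to the caller with errno intact. */
int accept_retry_eintr(int listen_fd, struct sockaddr *addr, socklen_t *addrlen)
{
    /* addrlen is a value-result argument, so remember the caller's
     * buffer size and restore it before every attempt. */
    socklen_t initial_len = (addrlen != NULL) ? *addrlen : 0;

    for (;;) {
        int fd;

        if (addrlen != NULL)
            *addrlen = initial_len;

        fd = accept(listen_fd, addr, addrlen);

        /* Success, or a real error: hand the result back. */
        if (fd >= 0 || errno != EINTR)
            return fd;

        /* EINTR: a signal handler (e.g. for SIGCLD) ran; try again. */
    }
}
```

With something like this in place the parent would keep listening after a
child exits, instead of printing "Accept error: Interrupted ..." and
calling _exit(1) as in the trace.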
However, I notice that you also have a SIGSEGV in the child process
6419. Our ndoutils (the older 1.4b3) doesn't show this error, so there
may be other problems with 1.4b7 on Solaris that also need fixing.
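
One detail that may help whoever looks into the SIGSEGV: the fault address
in the trace, 0x44415441, consists entirely of printable ASCII bytes. The
small sketch below (the function name addr_to_ascii() is just for this
example) decodes such a 32-bit value, most significant byte first. Whether
this means a pointer was overwritten with text from the input stream is
only a guess; it has not been checked against the 1.4b7 source.

```c
#include <stdint.h>

/* Illustrative helper: write the four bytes of a 32-bit value into out,
 * most significant byte first. Non-printable bytes are shown as '.'.
 * out must have room for 5 chars (4 bytes plus the terminating NUL). */
void addr_to_ascii(uint32_t addr, char out[5])
{
    int i;

    for (i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)((addr >> (24 - 8 * i)) & 0xFF);
        out[i] = (c >= 0x20 && c < 0x7F) ? (char)c : '.';
    }
    out[4] = '\0';
}
```

Calling addr_to_ascii(0x44415441, buf) leaves the string "DATA" in buf.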
Ton
http://www.altinity.com
UK: +44 (0)870 787 9243
US: +1 866 879 9184
Fax: +44 (0)845 280 1725
Skype: tonvoon
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ndoutils_solaris_eintr_in_accept.patch.txt
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20080227/41bbcb67/attachment.txt>
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel