Grappling with check_by_ssh problems. Long and boring.
Stanley Hopcroft
Stanley.Hopcroft at IPAustralia.Gov.AU
Tue Feb 25 01:07:59 CET 2003
Dear Ladies and Gentlemen,
I am writing to invite your comments on how to deal with a check_by_ssh
service check that Nag intermittently reports as being in the UNKNOWN
state, despite the plugin output suggesting the state is OK and
check_by_ssh run from the CLI (as the Nag user) reporting OK (i.e.
check_by_ssh ... && echo $? displays '0' after the plugin output).
Trussing (I'd like to bind and gag it) the sh -c /usr/bin/ssh hostname
.. shows that the shell is indeed returning an unhelpful (bogon) return
code, so Nag's behaviour in reporting it as UNKNOWN is correct.
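To make the return-code reasoning concrete, here is a minimal sketch in C of how a plugin exit status maps to a service state. The codes 0 to 3 are the documented plugin return codes; the helper names are hypothetical, and treating every out-of-range status as UNKNOWN is an assumption that matches the behaviour described above rather than something taken from the Nag source.

```c
#include <assert.h>

/* Plugin return codes as documented for Nagios plugins. */
enum nagios_state {
    STATE_OK = 0,
    STATE_WARNING = 1,
    STATE_CRITICAL = 2,
    STATE_UNKNOWN = 3
};

/* Hypothetical helper, not Nagios code: map a plugin exit status to a state.
 * Assumption: any status outside 0..3 is treated as UNKNOWN. */
enum nagios_state state_from_exit_status(int exit_status)
{
    if (exit_status >= STATE_OK && exit_status <= STATE_CRITICAL)
        return (enum nagios_state)exit_status;
    return STATE_UNKNOWN;
}

/* Hypothetical helper: printable name for a state. */
const char *state_name(enum nagios_state state)
{
    static const char *const names[] = { "OK", "WARNING", "CRITICAL", "UNKNOWN" };
    int index = (int)state;

    assert(index >= 0 && index <= 3);   /* guard the table lookup */
    return names[index];
}
```

Under that assumption, state_from_exit_status(255) yields STATE_UNKNOWN even when the plugin's text output says OK, because only the exit status is consulted.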
close(4) = 0 (0x0)
select(0x9,0x8069370,0x8069380,0x0,0x0) = 1 (0x1)
write(3,0x806d000,32) = 32 (0x20)
select(0x9,0x8069370,0x8069380,0x0,0x0) = 1 (0x1)
read(0x3,0xbfbfdaa0,0x2000) = 64 (0x40)
select(0x9,0x8069370,0x8069380,0x0,0x0) = 1 (0x1)
write(6,0x8074000,17) = 17 (0x11)
select(0x9,0x8069370,0x8069380,0x0,0x0) = 1 (0x1)
read(0x3,0xbfbfdaa0,0x2000) = 64 (0x40)
close(6) = 0 (0x0)
close(8) = 0 (0x0)
select(0x9,0x8069370,0x8069380,0x0,0x0) = 1 (0x1)
write(3,0x806d000,32) = 32 (0x20)
ioctl(0,TIOCGETA,0xbfbffa74) ERR#19 'Operation not supported by device'
fcntl(0x0,0x3,0x0) = 4 (0x4)
fcntl(0x0,0x4,0x0) ERR#19 'Operation not supported by device'
ioctl(1,TIOCGETA,0xbfbffa74) ERR#25 'Inappropriate ioctl for device'
fcntl(0x1,0x3,0x0) = 6 (0x6)
fcntl(0x1,0x4,0x2) = 0 (0x0)
ioctl(2,TIOCGETA,0xbfbffa74) ERR#25 'Inappropriate ioctl for device'
fcntl(0x2,0x3,0x0) = 6 (0x6)
fcntl(0x2,0x4,0x2) = 0 (0x0)
gettimeofday(0xbfbffa94,0x0) = 0 (0x0)
shutdown(0x3,0x2) = 0 (0x0)
close(3) = 0 (0x0)
exit(0xffffffff) process exit, rval = 65280
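The rval of 65280 in the last line of the trace can be unpacked by hand. The sketch below assumes the traditional BSD wait-status layout (exit code in bits 8 to 15, terminating signal in the low 7 bits); the struct and function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Assumption: traditional BSD wait-status layout. A normal exit stores
 * the exit code in bits 8..15 and leaves the low 7 bits (the terminating
 * signal number) at zero. */
struct decoded_status {
    int exited_normally;   /* 1 if the low 7 bits are zero */
    int exit_code;         /* bits 8..15 */
    int term_signal;       /* low 7 bits */
};

/* Hypothetical helper: split a 16-bit rval as printed by truss. */
struct decoded_status decode_rval(unsigned int rval)
{
    struct decoded_status d;

    assert(rval <= 0xffffu);           /* only 16 bits are meaningful here */
    d.term_signal = (int)(rval & 0x7fu);
    d.exited_normally = (d.term_signal == 0);
    d.exit_code = (int)((rval >> 8) & 0xffu);
    return d;
}

/* Hypothetical helper: the 8-bit exit code a parent sees for exit(arg). */
int truncated_exit_code(int arg)
{
    return (int)((unsigned int)arg & 0xffu);
}
```

With this layout, decode_rval(65280) reports a normal exit with code 255 and no signal, and truncated_exit_code(-1) is also 255, so the trace is consistent with something calling exit(-1).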
It seems like the next step forward is to truss the check process on the
remote server and see if it is in fact the origin of the
exit(0xffffffff).
FWIW, I have found that gdb -p <nag_pid> is not worth doing: with
this gdb (GNU gdb 4.18 (FreeBSD)), attempting to finish the debug
session with the 'detach' gdb command, which should let the process
continue, leads to Nag terminating without the status log being saved.
gdb is obviously sending a signal to Nag that it doesn't handle, so the
default action is occurring.
What on earth could be generating the exit(0xffffffff) system call
(exit with extreme prejudice)?
Right,
#include <stdlib.h>
int main(void) {
    exit(-1);  /* only the low 8 bits reach the parent: status 255 */
}
does the same thing.
... snip ..
sigaction(SIGILL,0xbfbffb14,0xbfbffafc) = 0 (0x0)
sigprocmask(0x1,0x0,0x2805c8fc) = 0 (0x0)
sigaction(SIGILL,0xbfbffafc,0x0) = 0 (0x0)
sigprocmask(0x1,0x2805c8c0,0xbfbffb3c) = 0 (0x0)
sigprocmask(0x3,0x2805c8d0,0x0) = 0 (0x0)
exit(0xffffffff) process exit, rval = 65280
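The same result can be checked from the parent's side without truss. The following is a rough sketch, assuming a POSIX system; observed_exit_status is a hypothetical helper that forks a child, has it exit with the given code, and returns what waitpid reports. It uses _exit rather than exit in the child so that the parent's stdio buffers are not flushed twice; the status truncation is the same.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` and return
 * the exit status the parent observes, or -1 if the child could not be
 * created or did not terminate normally. */
int observed_exit_status(int code)
{
    int status = 0;
    pid_t waited;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed */
    if (pid == 0)
        _exit(code);               /* child: same truncation as exit(code) */

    /* Parent: wait for that child, retrying if interrupted by a signal. */
    do {
        waited = waitpid(pid, &status, 0);
    } while (waited < 0 && errno == EINTR);

    if (waited < 0)
        return -1;
    assert(waited == pid);         /* we waited for exactly this child */

    if (!WIFEXITED(status))
        return -1;                 /* killed by a signal, not exit() */
    return WEXITSTATUS(status);    /* low 8 bits of the exit argument */
}
```

Here observed_exit_status(-1) returns 255, which is the status the shell and Nag would see. That suggests looking on the remote side for a code path that calls exit(-1) or returns -1 from main.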
Thank you,
Yours sincerely.
--
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------
'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'
from Meditation 17, J Donne.