Grappling with check_by_ssh problems. Long and boring.

Richard Wu Richard.Wu at fusepoint.com
Tue Feb 25 01:38:08 CET 2003




Looks like the "UNKNOWN" return code applies only to certain remote clients. In my environment, remote Linux hosts return the correct status while remote Solaris hosts do not.

Richard Wu



-----Original Message-----
From: Stanley Hopcroft [mailto:Stanley.Hopcroft at IPAustralia.Gov.AU]
Sent: Monday, February 24, 2003 4:08 PM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] Grappling with check_by_ssh problems. Long and boring.


Dear Ladies and Gentlemen,

I am writing to invite your comments on how to deal with a check_by_ssh
service check that Nag intermittently reports as being in the UNKNOWN
state, even though the plugin output suggests the state is OK and
check_by_ssh run from the CLI (as the Nag user) reports OK (ie
check_by_ssh ... && echo $? displays '0' after the plugin output).

Trussing (I'd like to bind and gag it) the sh -c /usr/bin/ssh hostname
.. shows that the shell is indeed returning an unhelpful (bogon) return
code, so Nag's behaviour in reporting it as UNKNOWN is correct.

close(4)                                         = 0 (0x0)
select(0x9,0x8069370,0x8069380,0x0,0x0)          = 1 (0x1)
write(3,0x806d000,32)                            = 32 (0x20)
select(0x9,0x8069370,0x8069380,0x0,0x0)          = 1 (0x1)
read(0x3,0xbfbfdaa0,0x2000)                      = 64 (0x40)
select(0x9,0x8069370,0x8069380,0x0,0x0)          = 1 (0x1)
write(6,0x8074000,17)                            = 17 (0x11)
select(0x9,0x8069370,0x8069380,0x0,0x0)          = 1 (0x1)
read(0x3,0xbfbfdaa0,0x2000)                      = 64 (0x40)
close(6)                                         = 0 (0x0)
close(8)                                         = 0 (0x0)
select(0x9,0x8069370,0x8069380,0x0,0x0)          = 1 (0x1)
write(3,0x806d000,32)                            = 32 (0x20)
ioctl(0,TIOCGETA,0xbfbffa74)                     ERR#19 'Operation not supported by device'
fcntl(0x0,0x3,0x0)                               = 4 (0x4)
fcntl(0x0,0x4,0x0)                               ERR#19 'Operation not supported by device'
ioctl(1,TIOCGETA,0xbfbffa74)                     ERR#25 'Inappropriate ioctl for device'
fcntl(0x1,0x3,0x0)                               = 6 (0x6)
fcntl(0x1,0x4,0x2)                               = 0 (0x0)
ioctl(2,TIOCGETA,0xbfbffa74)                     ERR#25 'Inappropriate ioctl for device'
fcntl(0x2,0x3,0x0)                               = 6 (0x6)
fcntl(0x2,0x4,0x2)                               = 0 (0x0)
gettimeofday(0xbfbffa94,0x0)                     = 0 (0x0)
shutdown(0x3,0x2)                                = 0 (0x0)
close(3)                                         = 0 (0x0)
exit(0xffffffff)                                process exit, rval = 65280

It seems like the next step forward is to truss the check process on the 
remote server and see if it is in fact the origin of the 
exit(0xffffffff).

FWIW, I have found that gdb -p <nag_pid> is not worth doing: with
this gdb (GNU gdb 4.18 (FreeBSD)), attempting to finish the debug
session with the 'detach' gdb command, which should let the process
continue, leads to Nag terminating without the status log being saved.

gdb is obviously sending a signal to Nag that it doesn't handle, so the
default action is occurring.

What on earth could be generating the exit(0xffffffff) system call
(exit with extreme prejudice)?

Right

#include <stdlib.h>

int main(void) {
  exit(-1);  /* only the low 8 bits are kept: the parent sees 255 */
}

does the same thing.

  ... snip ..
sigaction(SIGILL,0xbfbffb14,0xbfbffafc)          = 0 (0x0)
sigprocmask(0x1,0x0,0x2805c8fc)                  = 0 (0x0)
sigaction(SIGILL,0xbfbffafc,0x0)                 = 0 (0x0)
sigprocmask(0x1,0x2805c8c0,0xbfbffb3c)           = 0 (0x0)
sigprocmask(0x3,0x2805c8d0,0x0)                  = 0 (0x0)
exit(0xffffffff)                                process exit, rval = 65280

Thank you,

Yours sincerely.



 -- 
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------

'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'

from Meditation 17, J Donne.


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null






