Anyone? : SIGSEGV when trying to use eventhandler
nagios
nagios at chadmail.com
Wed May 19 14:15:26 CEST 2010
Thanks for the reply Guy.
I installed Nagios from the "contrib" repo, following the blog post linked
below, and everything looks like it's working fine. This is the most recent
version in the repos. I guess I could compile it from source, but I would like
to be sure that's the issue before I go down that path.
http://blogs.sun.com/baiken/entry/nagios_installation_guide_for_opensolaris
And this additional information....
nas@nas:/usr/nagios/libexec/eventhandler# ldd /usr/nagios/bin/nagios
libm.so.2 => /lib/libm.so.2
libpthread.so.1 => /lib/libpthread.so.1
libdl.so.1 => /lib/libdl.so.1
libc.so.1 => /lib/libc.so.1
Installed Nagios from contrib. repo.
Summary: Host/service/network monitoring program
Size: 12.19 MB
Category: None
Installed Version: 3.0.6,5.11-0.111
Latest Version: 3.0.6,5.11-0.111
Packaging Date: Tue Oct 27 16:14:19 2009
FMRI: pkg:/nagios@3.0.6,5.11-0.111:20091027T161419Z
Repository: contrib
re: trace...Solaris does have dtrace...and it's supposed to be pretty good,
but I'd need to read up a heap to understand how to use it.
Any more ideas folks?
-----Original Message-----
From: Guy Waugh <guidosh at gmail.com>
To: Nagios Users List <nagios-users at lists.sourceforge.net>
Date: Wed, 19 May 2010 11:07:52 +0100
Subject: Re: [Nagios-users] Anyone? : SIGSEGV when trying to use
eventhandler
I'm definitely no expert but...
* What does it say when you 'ldd' the nagios binary? Are all the libraries
the binary is linked against able to be found? Are those libraries
up-to-date?
* Where did you get nagios from? Did you compile it or is it pre-built? If
pre-built, are there any updates?
* I don't know Solaris well enough to know how to trace your running nagios
with a very simple configuration, but that might be the next step. strace?
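On the strace question: Solaris ships truss(1), which plays roughly the same role. The following is only a sketch of how the running daemon might be traced. trace_nagios is a hypothetical helper name, and the PID and output path are whatever the caller supplies; nothing here has been run against the setup described in this thread.

```shell
# Sketch only: attach Solaris truss(1) to a running Nagios daemon.
# trace_nagios is a hypothetical helper, not part of Nagios or Solaris.
trace_nagios() {
    if [ -z "$1" ]; then
        echo "usage: trace_nagios PID [OUTFILE]" >&2
        return 2
    fi
    pid=$1
    out=${2:-/tmp/nagios.truss}   # assumed default output path
    if ! command -v truss >/dev/null 2>&1; then
        echo "truss not found (it ships with Solaris)" >&2
        return 1
    fi
    # -f follows forked children (such as a spawned event handler),
    # -o writes the trace to a file, -p attaches to an existing PID.
    truss -f -o "$out" -p "$pid"
}
```

It would be called with the daemon's current PID, for example `trace_nagios 5232`, and left running until the host check fails. The last system calls before the SIGSEGV in the output file may narrow down where the daemon dies.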
On 19 May 2010 10:49, nagios <nagios at chadmail.com> wrote:
Anybody?
If you need extra information, just let me know what you need to see and
I'll upload it.
Thanks.
-----Original Message-----
From: "nagios" <nagios at chadmail.com>
To: nagios-users at lists.sourceforge.net
Date: Wed, 19 May 2010 01:42:15 +1000
Subject: [Nagios-users] SIGSEGV when trying to use eventhandler
Hi guys,
I am new to Nagios, but so far it's working well for me and is monitoring
a number of real and virtual hosts. Nagios 3.0.6 is installed on an
OpenSolaris 2009.06 host and is monitoring routers, other devices, and VMs in
VirtualBox.
My issue is when I try to add an event handler, I get a SIGSEGV and nagios
restarts.
I have posted the details of the code I am using and the error
here...http://pastebin.com/vBb7xTND and also below (but it reads better @
pastebin).
I have tried several different scripts and code combinations (even empty
scripts and commands like ls) and all give the same error.
Can anyone help me work out why it's happening?
Thanks.
hosts.cfg
<snip>
define host{
use windows-server ; Inherit default values from a template
host_name Server6 ; The name we're giving to this host
max_check_attempts 4
event_handler vboxmanage-restart ; Restart the vm
alias Server 6 - Win2008 Server ; A longer name associated with the host
address 192.168.0.6 ; IP address of the host
}
<snip>
commands.cfg - note I have tried various scripts here incl. ones from the
nagios guides/books and all give the same error.
<snip>
# 'vboxmanage_restart' command definition
define command{
command_name vboxmanage-restart
# command_line ls
command_line sudo -u nas $USER1$/eventhandler/event_vboxmanage_restart -S $SERVICESTATE$ -T $SERVICESTATETYPE$ -A $SERVICEATTEMPT$ -H Server6
}
<snip>
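A side note on the definition above, offered as something to check, not as a diagnosis: the command is attached to a host via event_handler, yet its command_line passes the service macros $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$. The Nagios event handler documentation uses $HOSTSTATE$, $HOSTSTATETYPE$ and $HOSTATTEMPT$ for host event handlers. Since the crash is reported even with a bare ls as the command, this may well be unrelated to the SIGSEGV. The sketch below just lists service macros in a config file for review; find_service_macros is a hypothetical helper name, not a Nagios tool.

```shell
# Hypothetical helper (not part of Nagios): print, with line numbers, every
# line of an object config file that uses a $SERVICE...$ macro, so it can be
# compared against the host-level macros by hand.
find_service_macros() {
    # $1: path to a Nagios object config file, e.g. commands.cfg
    grep -n '\$SERVICE[A-Z]*\$' "$1"
}
```

Running `find_service_macros commands.cfg` from the directory holding the file would print each matching line. Which of those commands are referenced from host definitions still has to be checked manually.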
nagios.log
[1274193005] HOST ALERT: Server6;DOWN;SOFT;1;PING CRITICAL - Packet loss =
100%
[1274193005] Caught SIGSEGV, shutting down...
[1274193005] Nagios 3.0.6 starting... (PID=5231)
[1274193005] Local time is Wed May 19 00:30:05 EST 2010
[1274193005] LOG VERSION: 2.0
[1274193005] Finished daemonizing... (New PID=5232)
the scripts... (yes I know it should not be 777's but just to show it's not
a permissions thing)
-rwxrwxrwx 1 nagios nagios 1580 2010-05-18 00:52 event_vboxmanage_restart
-rwxrwxrwx 1 nagios nagios 3815 2010-05-18 23:07 filename.out
-rwxrwxrwx 1 nagios nagios 2211 2010-05-19 00:23 restart-httpd
nas@nas:/usr/nagios/libexec/eventhandler#
The script works fine when run as the nagios user via sudo (I added nagios to
/etc/sudoers)
nas@nas:…sr/nagios/libexec/eventhandler$ whoami
nagios
nas@nas:…sr/nagios/libexec/eventhandler$ sudo -u nas ./event_vboxmanage_restart -S CRITICAL -T HARD -A 1 -H Server6
CRITICAL(C) 2005-2010 Sun Microsystems, Inc.
The event_vboxmanage_restart script...not that this is likely to be at fault
(I don't think so anyway, as I get the error with other very simple scripts
too).
#!/usr/bin/perl
use Getopt::Long;
use Net::Telnet ();
use Switch;
my ($state,$type,$attempt,$cmd,$hostname);
open(MYOUTFILE, ">>/usr/nagios/libexec/eventhandler/filename.out");
&processargs;
print "$state";
switch ($state) {
case "OK" { &state_OK }
case "WARNING" { &state_WARNING }
case "UNKNOWN" { &state_UNKNOWN }
case "CRITICAL" { &state_CRITICAL }
else { print "unrecognised state>$state" }
}
print MYOUTFILE">$state<";
print MYOUTFILE">$hostname<";
close(MYOUTFILE);
exit 0;
sub processargs {
GetOptions (
"S|state=s" => \$state,
"T|type=s" => \$type,
"A|attempt=i" => \$attempt,
"H|hostname=s" => \$hostname,
"C|command=s" => \$cmd,
);
}
### FUNC: print $state
sub print_state {
}
### FUNC: print $state
sub state_OK {
}
### FUNC: print $state
sub state_WARNING {
}
### FUNC: print $state
sub state_UNKNOWN {
}
### FUNC: print $state
sub state_CRITICAL {
    if ("$type" eq "HARD" or ("$type" eq "SOFT" and $attempt == 3)) {
        @result = `VBoxManage controlvm $hostname acpipowerbutton`;
        foreach (@result) {
            print MYOUTFILE "$_\n";
        }
        sleep(60);
        @result = `VBoxManage controlvm $hostname poweroff`;
        foreach (@result) {
            print MYOUTFILE "$_\n";
        }
        @result = `VBoxManage startvm $hostname`;
        print "$result[1]";
    }
    else { }
}
As you can see from the log below, it all works fine (i.e. no SIGSEGVs) if I
comment out the event_handler line in the hosts.cfg file.
[05-19-2010 01:33:50] SERVICE ALERT:
Server6;Explorer;OK;HARD;1;Explorer.EXE: Running
[05-19-2010 01:32:50] SERVICE ALERT: Server6;Uptime;OK;HARD;1;System Uptime
- 0 day(s) 0 hour(s) 9 minute(s)
[05-19-2010 01:32:40] SERVICE ALERT: Server6;C:\ Drive Space;OK;HARD;1;c:\ -
total: 39.90 Gb - used: 9.19 Gb (23%) - free 30.71 Gb (77%)
[05-19-2010 01:32:10] SERVICE ALERT: Server6;CPU Load;OK;HARD;1;CPU Load 3%
(5 min average)
[05-19-2010 01:25:00] HOST ALERT: Server6;UP;SOFT;4;PING OK - Packet loss =
0%, RTA = 0.44 ms
[05-19-2010 01:23:50] SERVICE ALERT:
Server6;Explorer;CRITICAL;HARD;1;Connection refused
[05-19-2010 01:23:50] HOST ALERT: Server6;DOWN;SOFT;3;PING CRITICAL - Packet
loss = 100%
[05-19-2010 01:23:00] SERVICE ALERT: Server6;Uptime;CRITICAL;HARD;1;CRITICAL
- Socket timeout after 10 seconds
[05-19-2010 01:22:50] SERVICE ALERT: Server6;C:\ Drive
Space;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:22:30] HOST ALERT: Server6;DOWN;SOFT;2;PING CRITICAL - Packet
loss = 100%
[05-19-2010 01:22:20] SERVICE ALERT: Server6;CPU
Load;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:21:10] HOST ALERT: Server6;DOWN;SOFT;1;PING CRITICAL - Packet
loss = 100%
[05-19-2010 01:21:00] SERVICE ALERT: Server6;Uptime;CRITICAL;SOFT;1;CRITICAL
- Socket timeout after 10 seconds
[05-19-2010 01:20:50] SERVICE ALERT: Server6;C:\ Drive
Space;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:02:10] SERVICE ALERT: Server6;CPU Load;OK;SOFT;1;CPU Load 0%
(5 min average)
[05-19-2010 01:00:50] SERVICE ALERT: Server6;Uptime;OK;SOFT;1;System Uptime
- 0 day(s) 0 hour(s) 57 minute(s)
[05-19-2010 01:00:40] SERVICE ALERT: Server6;C:\ Drive Space;OK;SOFT;1;c:\ -
total: 39.90 Gb - used: 9.19 Gb (23%) - free 30.71 Gb (77%)
------------------------------------------------------------------------------
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting
any issue.
::: Messages without supporting info will risk being sent to /dev/null