SIGSEGV when trying to use eventhandler
nagios
nagios at chadmail.com
Tue May 18 17:42:15 CEST 2010
Hi guys,
I am new to Nagios, but so far it's working well for me and is monitoring a
number of real and virtual hosts. Nagios 3.0.6 is installed on an
OpenSolaris 2009.06 host and monitors routers, other devices, and VMs in
VirtualBox.
My issue is that when I try to add an event handler, I get a SIGSEGV and
Nagios restarts.
I have posted the configuration, the script, and the log output at
http://pastebin.com/vBb7xTND and also below (it reads better on pastebin).
I have tried several different scripts and commands (even empty scripts and
plain commands like ls), and all give the same error.
Can anyone help me work out why it's happening?
Thanks.
hosts.cfg
<snip>
define host{
        use                 windows-server              ; Inherit default values from a template
        host_name           Server6                     ; The name we're giving to this host
        max_check_attempts  4
        event_handler       vboxmanage-restart          ; Restart the vm
        alias               Server 6 - Win2008 Server   ; A longer name associated with the host
        address             192.168.0.6                 ; IP address of the host
        }
<snip>
commands.cfg (note: I have tried various scripts here, including ones from
the Nagios guides and books, and all give the same error)
<snip>
# 'vboxmanage_restart' command definition
define command{
        command_name    vboxmanage-restart
#       command_line    ls
        command_line    sudo -u nas $USER1$/eventhandler/event_vboxmanage_restart -S $SERVICESTATE$ -T $SERVICESTATETYPE$ -A $SERVICEATTEMPT$ -H Server6
        }
<snip>
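One thing worth checking: vboxmanage-restart is attached to a host, and the
Nagios macro documentation lists the $SERVICE...$ macros as not available in
host event handlers. The host-side equivalents are $HOSTSTATE$,
$HOSTSTATETYPE$ and $HOSTATTEMPT$. Below is a sketch of the same definition
using those macros. It is untested, and whether it has any bearing on the
SIGSEGV is only an assumption, since a bare ls fails in the same way.

```
# Hypothetical host-macro variant of the command above (untested sketch)
define command{
        command_name    vboxmanage-restart
        command_line    sudo -u nas $USER1$/eventhandler/event_vboxmanage_restart -S $HOSTSTATE$ -T $HOSTSTATETYPE$ -A $HOSTATTEMPT$ -H Server6
        }
```

With this variant the script would receive UP, DOWN or UNREACHABLE rather
than OK, WARNING, UNKNOWN or CRITICAL, so its state checks would need to
match those values.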
nagios.log
[1274193005] HOST ALERT: Server6;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
[1274193005] Caught SIGSEGV, shutting down...
[1274193005] Nagios 3.0.6 starting... (PID=5231)
[1274193005] Local time is Wed May 19 00:30:05 EST 2010
[1274193005] LOG VERSION: 2.0
[1274193005] Finished daemonizing... (New PID=5232)
The scripts (yes, I know they should not be mode 777; that is just to show
it's not a permissions problem):
-rwxrwxrwx 1 nagios nagios 1580 2010-05-18 00:52 event_vboxmanage_restart
-rwxrwxrwx 1 nagios nagios 3815 2010-05-18 23:07 filename.out
-rwxrwxrwx 1 nagios nagios 2211 2010-05-19 00:23 restart-httpd
nas at nas:/usr/nagios/libexec/eventhandler#
The script works fine when run as the user nagios via sudo (I added nagios
to /etc/sudoers):
nas at nas:…sr/nagios/libexec/eventhandler$ whoami
nagios
nas at nas:…sr/nagios/libexec/eventhandler$ sudo -u nas ./event_vboxmanage_restart -S CRITICAL -T HARD -A 1 -H Server6
CRITICAL(C) 2005-2010 Sun Microsystems, Inc.
The event_vboxmanage_restart script (not that this is likely to be at
fault, I think, as I get the error with other very simple scripts too):
#!/usr/bin/perl
use Getopt::Long;
use Net::Telnet ();
use Switch;
my ($state,$type,$attempt,$cmd,$hostname);
# Append debug output to a log file next to the script.
open(MYOUTFILE, ">>/usr/nagios/libexec/eventhandler/filename.out");
&processargs;
print "$state";
# Dispatch on the state name passed with -S.
switch ($state) {
    case "OK"       { &state_OK }
    case "WARNING"  { &state_WARNING }
    case "UNKNOWN"  { &state_UNKNOWN }
    case "CRITICAL" { &state_CRITICAL }
    else            { print "unrecognised state>$state" }
}
print MYOUTFILE ">$state<";
print MYOUTFILE ">$hostname<";
close(MYOUTFILE);
exit 0;
### FUNC: parse the command-line options
sub processargs {
    GetOptions (
        "S|state=s"    => \$state,
        "T|type=s"     => \$type,
        "A|attempt=i"  => \$attempt,
        "H|hostname=s" => \$hostname,
        "C|command=s"  => \$cmd,
    );
}
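### FUNC: placeholder, currently unused
sub print_state {
}
### FUNC: state OK, no action
sub state_OK {
}
### FUNC: state WARNING, no action
sub state_WARNING {
}
### FUNC: state UNKNOWN, no action
sub state_UNKNOWN {
}
### FUNC: state CRITICAL, restart the VM
sub state_CRITICAL {
    # Act on a HARD state, or on the third SOFT attempt.
    if ("$type" eq "HARD" or ("$type" eq "SOFT" and $attempt == 3)) {
        # Ask the guest to shut down cleanly and log the output.
        @result = `VBoxManage controlvm $hostname acpipowerbutton`;
        foreach (@result) {
            print MYOUTFILE "$_\n";
        }
        # Wait 60 seconds, then force a power-off and log the output.
        sleep(60);
        @result = `VBoxManage controlvm $hostname poweroff`;
        foreach (@result) {
            print MYOUTFILE "$_\n";
        }
        # Start the VM again and print the second line of its output.
        @result = `VBoxManage startvm $hostname`;
        print "$result[1]";
    }
    else { }
}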
As you can see from the log below, it all works fine (i.e. no SIGSEGVs) if I
comment out the event_handler line in the hosts.cfg file.
[05-19-2010 01:33:50] SERVICE ALERT: Server6;Explorer;OK;HARD;1;Explorer.EXE: Running
[05-19-2010 01:32:50] SERVICE ALERT: Server6;Uptime;OK;HARD;1;System Uptime - 0 day(s) 0 hour(s) 9 minute(s)
[05-19-2010 01:32:40] SERVICE ALERT: Server6;C:\ Drive Space;OK;HARD;1;c:\ - total: 39.90 Gb - used: 9.19 Gb (23%) - free 30.71 Gb (77%)
[05-19-2010 01:32:10] SERVICE ALERT: Server6;CPU Load;OK;HARD;1;CPU Load 3% (5 min average)
[05-19-2010 01:25:00] HOST ALERT: Server6;UP;SOFT;4;PING OK - Packet loss = 0%, RTA = 0.44 ms
[05-19-2010 01:23:50] SERVICE ALERT: Server6;Explorer;CRITICAL;HARD;1;Connection refused
[05-19-2010 01:23:50] HOST ALERT: Server6;DOWN;SOFT;3;PING CRITICAL - Packet loss = 100%
[05-19-2010 01:23:00] SERVICE ALERT: Server6;Uptime;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:22:50] SERVICE ALERT: Server6;C:\ Drive Space;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:22:30] HOST ALERT: Server6;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
[05-19-2010 01:22:20] SERVICE ALERT: Server6;CPU Load;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:21:10] HOST ALERT: Server6;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
[05-19-2010 01:21:00] SERVICE ALERT: Server6;Uptime;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:20:50] SERVICE ALERT: Server6;C:\ Drive Space;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:02:10] SERVICE ALERT: Server6;CPU Load;OK;SOFT;1;CPU Load 0% (5 min average)
[05-19-2010 01:00:50] SERVICE ALERT: Server6;Uptime;OK;SOFT;1;System Uptime - 0 day(s) 0 hour(s) 57 minute(s)
[05-19-2010 01:00:40] SERVICE ALERT: Server6;C:\ Drive Space;OK;SOFT;1;c:\ - total: 39.90 Gb - used: 9.19 Gb (23%) - free 30.71 Gb (77%)
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users