debugging eventhandler via nrpe
Assaf Flatto
nagios at flatto.net
Fri Nov 12 14:21:43 CET 2010
Hello all,
I am trying to implement an event handler on a remote machine, and I am
having a problem with the way the status arguments are passed over
the NRPE channel.
My config is as follows:
define service{
        <snip>
        max_check_attempts      3
        event_handler           check_nrpe_nonssl!solr-restart!$SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$!-t 200
        <snip>
}
On the remote server, NRPE was compiled with --enable-command-args,
and in nrpe.cfg I have the following parameters:
dont_blame_nrpe=1
debug=1
command_timeout=300
# Event Handler
command[solr-restart]=/usr/local/nagios/libexec/eventhandlers/restart-solr.sh
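As far as I understand NRPE, values passed with -a only reach the command if its definition references the $ARG1$, $ARG2$, ... macros. A minimal sketch of a check for that, assuming this behaviour holds for the installed version; the helper name forwards_args is hypothetical and the second definition is only an illustration, not the configuration in use:

```shell
#!/bin/bash
# Hypothetical helper: report whether an nrpe.cfg command definition
# references any $ARGn$ macro. Name and messages are illustrative only.
forwards_args() {
    local definition="$1"
    if printf '%s\n' "$definition" | grep -q '\$ARG[0-9][0-9]*\$'; then
        echo "forwards arguments"
    else
        echo "does not forward arguments"
    fi
}

# The definition as configured above (no macros):
forwards_args 'command[solr-restart]=/usr/local/nagios/libexec/eventhandlers/restart-solr.sh'
# An illustrative variant that would forward three arguments:
forwards_args 'command[solr-restart]=/usr/local/nagios/libexec/eventhandlers/restart-solr.sh $ARG1$ $ARG2$ $ARG3$'
```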
The event handler script is:
#!/bin/bash
#
# Event handler script for restarting the web server on the local machine
#
# Note: This script will only restart the web server if the service is
#       retried 3 times (in a "soft" state) or if the web service somehow
#       manages to fall into a "hard" error state.
#
case "$1" in
OK)
        # The service just came back up, so don't do anything...
        ;;
WARNING)
        ;;
UNKNOWN)
        ;;
CRITICAL)
        case "$2" in
        SOFT)
                case "$3" in
                2)
                        echo "Too early - not restarting yet"
                        exit 0
                        ;;
                esac
                case "$3" in
                3)
                        echo "Too early - not restarting yet"
                        exit 0
                        ;;
                esac
                ;;
        HARD)
                cd /usr/local/nagios/libexec/eventhandlers/
                curl -s -v -u *****:******* --request PUT -d @solr7down.xml http://LB
                sleep 5
                sudo /etc/init.d/jboss stop
                sleep 60
                sudo /etc/init.d/jboss start
                sleep 15
                curl -s -v -u *****:****** --request PUT -d @solr7up.xml http://LB
                sleep 3
                echo " Event handler restarted the solr service"
                ;;
        esac
        ;;
esac
echo "Event handler restarted the solr service"
exit 0
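One way to see what the script actually receives from NRPE is to log its arguments before the case statement runs. A minimal sketch, assuming the nrpe user can write to the log file; the function name log_handler_args and the path /tmp/restart-solr-args.log are hypothetical and not part of the script above:

```shell
#!/bin/bash
# Hypothetical debugging helper, not part of the original handler.
# The LOG path is an assumption; use any file the nrpe user can write to.
LOG="${LOG:-/tmp/restart-solr-args.log}"

log_handler_args() {
    local i=1
    local arg
    {
        # Record a timestamp and how many arguments arrived.
        echo "$(date '+%F %T') argc=$#"
        # Record each argument in brackets so stray spaces are visible.
        for arg in "$@"; do
            echo "  arg$i=[$arg]"
            i=$((i + 1))
        done
    } >> "$LOG"
}

# At the top of the handler this would be called as:
log_handler_args "$@"
```

If the log shows argc=0 when Nagios triggers the handler, the arguments are being lost before the script starts rather than inside it.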
I can see in the syslog that the script is initiated:
Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Host address is in allowed_hosts
Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Handling the connection...
Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Host is asking for command 'solr-restart' to be run...
Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Running command: /usr/local/nagios/libexec/eventhandlers/restart-solr.sh
Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Command completed with return code 0 and output:
But the event handler is not performing the tasks it is meant to.
When I run the command manually:
~/libexec/check_nrpe -H <host> -n -c solr-restart -a " Critical HARD 3" -t 200
it runs fine.
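Two plain shell details may matter when comparing the manual run with what Nagios sends: a quoted string after -a is a single argument, and case patterns in bash are case-sensitive. A small sketch of both, using hypothetical helper names; how NRPE then maps the arguments onto the remote command is outside what this shows:

```shell
#!/bin/bash
# Illustrative only: local shell behaviour, independent of NRPE.

# Print how many arguments the shell delivered.
count_args() {
    echo "$#"
}

# Mimic the first test in the handler's case statement.
matches_critical() {
    case "$1" in
    CRITICAL) echo "yes" ;;
    *)        echo "no" ;;
    esac
}

count_args " Critical HARD 3"   # one quoted string: prints 1
count_args CRITICAL HARD 3      # three separate words: prints 3

matches_critical "CRITICAL"     # prints yes
matches_critical "Critical"     # prints no (case patterns are case-sensitive)
```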
I know it is an issue with how the arguments are passed to NRPE,
but I am missing whatever is needed to make sure they are parsed and sent properly.
Can anyone point me to the sign right in front of me that I am blindly missing?
Thanks
--
Never,Ever Cut A Deal With a Dragon