debugging eventhandler via nrpe
Kaiwang Chen
kaiwang.chen at gmail.com
Mon Nov 15 19:39:11 CET 2010
I guess the "-t 200" option is what got you, because
check_nrpe_nonssl!solr-restart!$SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$!-t 200
will be processed in two steps:
1) macro expansion in args
solr-restart!CRITICAL SOFT 1!-t 200
ARG1=solr-restart
ARG2=CRITICAL SOFT 1
ARG3=-t 200
2) macro expansion of the whole raw command line,
2.1) say the definition is "check_nrpe -H $HOSTADDRESS$ -n -c $ARG1$ -a $ARG2$ $ARG3$"
check_nrpe -H <ip> -n -c solr-restart -a "CRITICAL SOFT 1" "-t 200"
2.2) or, say it is "check_nrpe -H $HOSTADDRESS$ -n -c $ARG1$ -a $ARG2$"
check_nrpe -H <ip> -n -c solr-restart -a "CRITICAL SOFT 1"
That is, if ARGn is the last positional macro in the definition and more
than n args are specified, the additional ones are simply ignored.
In either case, "-t 200" is not used as the timeout you expected, so
check_nrpe falls back to its default of 10 seconds.
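
The two steps above can be sketched roughly in bash. This is only a simulation to show the idea, not Nagios code: expand_command is a hypothetical helper, and only the ARGn macros are handled.

```shell
#!/bin/bash
# Rough simulation of the two steps above. NOT Nagios code:
# expand_command is a hypothetical helper used only for illustration.

expand_command() {
    local raw="$1"         # raw line, e.g. 'cmd!arg1!arg2'
    local definition="$2"  # command line containing $ARGn$ placeholders
    local -a parts
    local i

    # Step 1: split on '!'. parts[0] is the command name,
    # parts[1], parts[2], ... play the role of ARG1, ARG2, ...
    IFS='!' read -r -a parts <<< "$raw"

    # Step 2: replace each $ARGn$ found in the definition.
    # An argument with no matching placeholder is simply dropped.
    for (( i = 1; i < ${#parts[@]}; i++ )); do
        definition="${definition//\$ARG${i}\$/${parts[i]}}"
    done

    printf '%s\n' "$definition"
}

raw='check_nrpe_nonssl!solr-restart!CRITICAL SOFT 1!-t 200'

# Definition 2.1 uses ARG1..ARG3, definition 2.2 only ARG1..ARG2.
expand_command "$raw" 'check_nrpe -H $HOSTADDRESS$ -n -c $ARG1$ -a $ARG2$ $ARG3$'
expand_command "$raw" 'check_nrpe -H $HOSTADDRESS$ -n -c $ARG1$ -a $ARG2$'
```

The sketch does plain text substitution and adds no quoting of its own. With the second definition the "-t 200" part never appears in the result, which is the "simply ignored" case.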
Then check_nrpe generates a query to nrpe:
solr-restart!CRITICAL SOFT 1!-t 200
or
solr-restart!CRITICAL SOFT 1
And nrpe invokes the defined command as
/usr/local/nagios/libexec/eventhandlers/restart-solr.sh "CRITICAL SOFT 1" "-t 200"
or
/usr/local/nagios/libexec/eventhandlers/restart-solr.sh "CRITICAL SOFT 1"
given the definition
command[solr-restart]=/usr/local/nagios/libexec/eventhandlers/restart-solr.sh $ARG1$ $ARG2$
or
command[solr-restart]=/usr/local/nagios/libexec/eventhandlers/restart-solr.sh $ARG1$
Similarly, if ARGn is the last positional macro in the command definition
and more than n args arrive, the additional ones are simply ignored.
The above analysis is not verified.
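
One way to check it would be to temporarily point command[solr-restart] at a stub that only prints what it receives. With debug=1 the output should then show up in the nrpe syslog line "Command completed with return code 0 and output:". A minimal sketch, where dump_args is a hypothetical name and swapping the script in is only a suggestion, not something tried in this thread:

```shell
#!/bin/bash
# Hypothetical debugging stub: report how many arguments arrived
# and show each one in brackets so embedded spaces are visible.

dump_args() {
    local i=1 arg
    printf 'argc=%d\n' "$#"
    for arg in "$@"; do
        printf 'arg%d=[%s]\n' "$i" "$arg"
        i=$((i + 1))
    done
}

# Print whatever this script itself was called with.
dump_args "$@"
```

If the status values arrive as one argument, the stub shows a single bracketed entry containing all three words; if nothing is passed through at all, it prints only argc=0.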
thanks,
kc
2010/11/15 Assaf Flatto <nagios at flatto.net>:
> Nagios.log show that the event handler is being executed ,and showing
> the passed parameters
>
>
> from nagios.log
>
> <HOST>;JBoss Port
> 8080;CRITICAL;SOFT;1;check_nrpe_nonssl!solr-restart!CRITICAL SOFT 1!-t 200
>
>
>
> the "-t 200" is to tell nagios to wait for the reply for 200 seconds -
> on the remote host we have a timeout of 300 , so that will allow the
> nrpe session time to work the command and then terminate .
>
>
> The debug log is not showing the event handler at all , i have changed
> the logging to -1 and hopefully it will give me some more data .
>
>
> Assaf
>
>
> quanta wrote:
>
>> Check your nagios.log file. Why didn't you put the argument (-t 200) to
>> the remote host, in nrpe.cfg?
>>
>>
>> On 11/12/2010 01:21 PM, Assaf Flatto wrote:
>>
>>> Hello all
>>>
>>> I am trying to implement an event handler on a remote machine , and
>>> having a problem with the way the status arguments are transferred over
>>> the NRPE channel .
>>>
>>> My config is as such :
>>>
>>> define service{
>>> <snip>
>>> max_check_attempts 3
>>> event_handler
>>> check_nrpe_nonssl!solr-restart!$SERVICESTATE$ $SERVICESTATETYPE$
>>> $SERVICEATTEMPT$!-t 200
>>>
>>> <snip>
>>> }
>>>
>>> On the remote server the nrpe was compiled with --allow-command-args
>>> ,and in the nrpe.cfg i have the following parameters ?
>>>
>>> dont_blame_nrpe=1
>>> debug=1
>>> command_timeout=300
>>> # Event Handler
>>> command[solr-restart]=/usr/local/nagios/libexec/eventhandlers/restart-solr.sh
>>>
>>>
>>> the event handler script is :
>>> #!/bin/bash
>>> #
>>> # Event handler script for restarting the web server on the local machine
>>> #
>>> # Note: This script will only restart the web server if the service is
>>> # retried 3 times (in a "soft" state) or if the web service somehow
>>> # manages to fall into a "hard" error state.
>>> #
>>>
>>> case "$1" in
>>> OK)
>>>     # The service just came back up, so don't do anything...
>>>     ;;
>>> WARNING)
>>>     ;;
>>> UNKNOWN)
>>>     ;;
>>> CRITICAL)
>>>     case "$2" in
>>>     SOFT)
>>>
>>>         case "$3" in
>>>         2)
>>>             echo "Too early - not restarting yet"
>>>             exit 0
>>>             ;;
>>>         esac
>>>
>>>         case "$3" in
>>>         3)
>>>             echo "Too early - not restarting yet"
>>>             exit 0
>>>             ;;
>>>         esac
>>>         ;;
>>>     HARD)
>>>         cd /usr/local/nagios/libexec/eventhandlers/
>>>         curl -s -v -u *****:******* --request PUT -d @solr7down.xml http://LB
>>>         sleep 5
>>>         sudo /etc/init.d/jboss stop
>>>
>>>         sleep 60
>>>         sudo /etc/init.d/jboss start
>>>         sleep 15
>>>         curl -s -v -u *****:****** --request PUT -d @solr7up.xml http://LB
>>>         sleep 3
>>>         echo " Event handler restarted the solr service"
>>>         ;;
>>>     esac
>>>     ;;
>>> esac
>>> echo "Event handler restarted the solr service"
>>> exit 0
>>>
>>> I can see in the syslog that the script is initiated :
>>> Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Host address is in allowed_hosts
>>> Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Handling the connection...
>>> Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Host is asking for command
>>> 'solr-restart' to be run...
>>> Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Running command:
>>> /usr/local/nagios/libexec/eventhandlers/restart-solr.sh
>>> Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Command completed with return
>>> code 0 and output:
>>>
>>> But the event handler is not performing the tasks it is meant to .
>>>
>>>
>>> when i initiate the command manually
>>>
>>> ~/libexec/check_nrpe -H <host> -n -c solr-restart -a " Critical HARD 3"
>>> -t 200
Fine? The query sent to nrpe is:
solr-restart! Critical HARD 3!-t!200
>>> it is running fine .
>>> I know it is an issue with the transfer of the arguments to the nrpe -
>>> but i am missing something to make sure they are parsed and sent properly .
>>>
>>> Anyone can point me to the sign in front of me i am blindingly missing ?
>>>
>>> Thanks
>>
>
>
> --
> Never,Ever Cut A Deal With a Dragon
>
>
> Next year I will be doing the London to Paris bike ride to
> raise money for the DogTrust (www.dogstrust.co.uk) .
> Please Sponsor me at http://www.justgiving.com/Assaf-Flatto
>
>
> ------------------------------------------------------------------------------
> Centralized Desktop Delivery: Dell and VMware Reference Architecture
> Simplifying enterprise desktop deployment and management using
> Dell EqualLogic storage and VMware View: A highly scalable, end-to-end
> client virtualization framework. Read more!
> http://p.sf.net/sfu/dell-eql-dev2dev
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>