debugging eventhandler via nrpe

Kaiwang Chen kaiwang.chen at gmail.com
Mon Nov 15 19:39:11 CET 2010


Guess the "-t 200" option got you, because

check_nrpe_nonssl!solr-restart!$SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$!-t 200

will be processed in two steps:

1) macro expansion in the arguments:
solr-restart!CRITICAL SOFT 1!-t 200
ARG1=solr-restart
ARG2=CRITICAL SOFT 1
ARG3=-t 200

2) macro expansion of the whole raw command line:
2.1) say the definition is "check_nrpe -H $HOSTADDRESS$ -n -c $ARG1$ -a $ARG2$ $ARG3$":
check_nrpe -H <ip> -n -c solr-restart -a "CRITICAL SOFT 1" "-t 200"
2.2) or, say it is "check_nrpe -H $HOSTADDRESS$ -n -c $ARG1$ -a $ARG2$":
check_nrpe -H <ip> -n -c solr-restart -a "CRITICAL SOFT 1"
That is, if more arguments are given than the highest $ARGn$ macro the
definition references, the extra ones are simply ignored.
In either case, "-t 200" is not used as the timeout option as expected,
so the default timeout of 10 seconds applies.

check_nrpe then generates a query to nrpe:
solr-restart!CRITICAL SOFT 1!-t 200
or
solr-restart!CRITICAL SOFT 1

And nrpe invokes the defined command as
/usr/local/nagios/libexec/eventhandlers/restart-solr.sh "CRITICAL SOFT 1" "-t 200"
or
/usr/local/nagios/libexec/eventhandlers/restart-solr.sh "CRITICAL SOFT 1"

given the definition
command[solr-restart]=/usr/local/nagios/libexec/eventhandlers/restart-solr.sh $ARG1$ $ARG2$
or
command[solr-restart]=/usr/local/nagios/libexec/eventhandlers/restart-solr.sh $ARG1$

Similarly, if more arguments are given than the highest $ARGn$ macro the
command definition references, the extra ones are simply ignored.
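
If this reading is right, one possible way out is to put the timeout into
the command definition on the Nagios side instead of passing it as an
argument. This is a sketch only: the command name check_nrpe_nonssl_t200 is
made up, and I am assuming $USER1$ points at the plugin directory as in the
stock resource.cfg.

```
# Nagios side (hypothetical command name)
define command{
    command_name    check_nrpe_nonssl_t200
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -n -t 200 -c $ARG1$ -a $ARG2$
}

# in the service definition
event_handler   check_nrpe_nonssl_t200!solr-restart!$SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$

# nrpe.cfg on the remote host: reference the argument so the script sees it
command[solr-restart]=/usr/local/nagios/libexec/eventhandlers/restart-solr.sh $ARG1$
```

Note that nagios.cfg has its own event_handler_timeout setting, which
would probably need to be at least as long as the check_nrpe timeout.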


The above analysis is not verified.
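
One way to check it would be to have the handler log exactly what it
receives. A minimal sketch: the log_args helper and the log path
/tmp/restart-solr.args.log are my own invention, not part of the original
script.

```bash
#!/bin/bash
# Hypothetical debugging aid for an event handler script.

# log_args LOGFILE [ARGS...]
# Appends a timestamp, the argument count, and each argument in brackets,
# so that embedded spaces and empty arguments are visible.
log_args() {
    local logfile=$1
    shift
    local i=0 arg
    {
        printf '%s argc=%d\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$#"
        for arg in "$@"; do
            i=$((i + 1))
            printf '  $%d=[%s]\n' "$i" "$arg"
        done
    } >> "$logfile"
}

# At the top of restart-solr.sh one would call:
#   log_args /tmp/restart-solr.args.log "$@"
```

The log then shows whether the state arrives as one argument
("CRITICAL SOFT 1"), as three separate ones, or not at all.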

thanks,
kc

2010/11/15 Assaf Flatto <nagios at flatto.net>:
> nagios.log shows that the event handler is being executed, along with
> the passed parameters.
>
>
> from nagios.log
>
> <HOST>;JBoss Port
> 8080;CRITICAL;SOFT;1;check_nrpe_nonssl!solr-restart!CRITICAL SOFT 1!-t 200
>
>
>
> The "-t 200" is there to tell Nagios to wait up to 200 seconds for the
> reply. On the remote host we have a timeout of 300, so that will allow the
> nrpe session time to run the command and then terminate.
>
>
> The debug log is not showing the event handler at all. I have changed
> the logging to -1, and hopefully it will give me some more data.
>
>
> Assaf
>
>
> quanta wrote:
>
>>  Check your nagios.log file. Why didn't you put the argument (-t 200) to
>> the remote host, in nrpe.cfg?
>>
>>
>> On 11/12/2010 01:21 PM, Assaf Flatto wrote:
>>
>>>   Hello all
>>>
>>> I am trying to implement an event handler on a remote machine , and
>>> having a problem with the way the status arguments are transferred over
>>> the NRPE channel .
>>>
>>> My config is as such :
>>>
>>> define service{
>>> <snip>
>>>   max_check_attempts              3
>>>   event_handler                   check_nrpe_nonssl!solr-restart!$SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$!-t 200
>>>
>>> <snip>
>>> }
>>>
>>> On the remote server, nrpe was compiled with --allow-command-args,
>>> and in nrpe.cfg I have the following parameters:
>>>
>>> dont_blame_nrpe=1
>>> debug=1
>>> command_timeout=300
>>> # Event Handler
>>> command[solr-restart]=/usr/local/nagios/libexec/eventhandlers/restart-solr.sh
>>>
>>>
>>> the event handler script is :
>>> #!/bin/bash
>>> #
>>> # Event handler script for restarting the web server on the local machine
>>> #
>>> # Note: This script will only restart the web server if the service is
>>> # retried 3 times (in a "soft" state) or if the web service somehow
>>> # manages to fall into a "hard" error state.
>>> #
>>>
>>> case "$1" in
>>> OK)
>>>   # The service just came back up, so don't do anything...
>>>   ;;
>>> WARNING)
>>>   ;;
>>> UNKNOWN)
>>>   ;;
>>> CRITICAL)
>>>   case "$2" in
>>>   SOFT)
>>>
>>>   case "$3" in
>>>   2)
>>>   echo "Too early - not restarting yet"
>>>   exit 0
>>>   ;;
>>>      esac
>>>
>>>   case "$3" in
>>>   3)
>>>   echo "Too early - not restarting yet"
>>>   exit 0
>>>   ;;
>>>   esac
>>>   ;;
>>>   HARD)
>>> cd /usr/local/nagios/libexec/eventhandlers/
>>> curl -s -v -u *****:******* --request PUT -d @solr7down.xml http://LB
>>> sleep 5
>>> sudo /etc/init.d/jboss stop
>>>
>>> sleep 60
>>> sudo /etc/init.d/jboss start
>>> sleep 15
>>> curl -s -v -u *****:****** --request PUT -d @solr7up.xml http://LB
>>> sleep 3
>>> echo " Event handler restarted the solr service"
>>>   ;;
>>>   esac
>>>   ;;
>>> esac
>>> echo "Event handler restarted the solr service"
>>> exit 0
>>>
>>> I can see in the syslog that the script is initiated:
>>> Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Host address is in allowed_hosts
>>> Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Handling the connection...
>>> Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Host is asking for command 'solr-restart' to be run...
>>> Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Running command: /usr/local/nagios/libexec/eventhandlers/restart-solr.sh
>>> Nov 11 18:15:44 gbc1-solr-07 nrpe[29687]: Command completed with return code 0 and output:
>>>
>>> But the event handler is not performing the tasks it is meant to.
>>>
>>>
>>> When I initiate the command manually
>>>
>>> ~/libexec/check_nrpe -H <host> -n -c solr-restart -a " Critical HARD 3" -t 200

Fine? The query sent to nrpe is:
solr-restart! Critical HARD 3!-t!200

>>> it is running fine.
>>> I know it is an issue with the transfer of the arguments to nrpe,
>>> but I am missing something to make sure they are parsed and sent properly.
>>>
>>> Can anyone point me to the sign in front of me that I am blindly missing?
>>>
>>> Thanks
>>  to /dev/null
>>
>
>
> --
> Never,Ever Cut A Deal With a Dragon
>
>
> Next year I will be doing the London to Paris bike ride to
> raise money for the DogTrust (www.dogstrust.co.uk) .
> Please Sponsor me at http://www.justgiving.com/Assaf-Flatto
>
>
> ------------------------------------------------------------------------------
> Centralized Desktop Delivery: Dell and VMware Reference Architecture
> Simplifying enterprise desktop deployment and management using
> Dell EqualLogic storage and VMware View: A highly scalable, end-to-end
> client virtualization framework. Read more!
> http://p.sf.net/sfu/dell-eql-dev2dev
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>
