Event Handlers Problem
Thomas Beecher
tbeecher at localnet.com
Thu May 19 14:54:28 CEST 2005
Problem solved!!!!
For whatever reason, I had to add /usr/bin/perl to the command definition. So,
instead of this:
$USER1$/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$
it has to be:
/usr/bin/perl $USER1$/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$
This works flawlessly.
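For reference, here is a sketch of what the full command definition looks like with the fix applied. It reuses the command_name and macros from the original definition quoted further down in this thread, and it assumes perl is installed at /usr/bin/perl on your system; adjust the path if yours differs.

```
# Sketch only: same definition as before, with the interpreter named explicitly.
define command{
        command_name    restart_pm3
        command_line    /usr/bin/perl $USER1$/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$
        }
```

As noted earlier in the thread, Nagios has to be restarted after the change before it takes effect.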
It doesn't make a whole lot of sense to me why this is required. The nagios user
can run any other perl script on the system that it has permissions for, and all
have /usr/bin/perl defined as the interpreter path, so it can't be a
permissions issue with the perl interpreter. Maybe it has something to do with this
copy of Nagios having been compiled with the embedded perl module.
Either way, I thought I'd post this up here in case someone else has similar issues.
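For anyone chasing a similar problem, a small logging helper can make it easier to see whether the handler itself runs and what it prints. The following is only a sketch: the function name run_logged, the log format, and the log path in the usage comment are made up for illustration and are not part of Nagios. It records the arguments, then the handler's stdout and stderr together, then its exit status.

```shell
#!/bin/sh
# Hypothetical debugging helper (not part of Nagios).
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log=$1
    shift
    {
        # Record exactly what we were asked to run.
        echo "args: $*"
        # Run the real handler; merge stderr into stdout so that
        # interpreter errors end up in the log as well.
        "$@" 2>&1
        # Record the handler's exit status.
        echo "exit: $?"
    } >> "$log"
}

# Example call from a wrapper script (paths are assumptions):
# run_logged /tmp/handler.log /usr/bin/perl /usr/local/nagios/libexec/restart_pm3.pl "$1" "$2"
```

If the log shows the "args:" line but nothing from the script and a non-zero exit status, the handler is being called but the script is failing to start, which is the situation described in the quoted messages below.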
Thomas Beecher II
Network Administrator
LocalNet Corp.
tbeecher at localnet.com
Thomas Beecher wrote:
> Well, to answer some questions that have been posed.
>
> 1. My testing was taking place on a 2.0b3 installation of Nagios. I was,
> however, able to replicate the behavior on a 1.1 install, our production
> instance, on a separate box.
>
> 2. I tested the perl script as the nagios user to make sure it would
> actually run as expected, and it did.
>
> 3. I have tried to call the script from inside a shell wrapper, to no
> avail. The shell script works fine if I call it directly (again, tested
> as myself and as the nagios user). I tested it naked, with no params,
> with constants, and with macro variables, all nada. In all cases, if I
> tee the output to a dummy file to see what the output is, it's ONLY the
> arguments that come out, never anything from the script.
>
> So, in essence, I'm still stuck. I'm cheating right now, taking the only
> output I can get (the macros), dumping that to a file, and running
> another script as a cron job that parses the file holding the macro
> variables, then calls the perl script. I know, it's a roundabout way
> of doing it, but it accomplishes the same task. I don't plan to try that
> on our production copy of nagios, only because there's already enough
> cron jobs running every 5 minutes, but thought I'd toss it out there in
> case someone else wants to try that for themselves.
>
> Thanks to everyone for all your help, if I ever get it working the
> correct way I'll be sure to post again and let you know.
>
>
> Thomas Beecher II
> Network Administrator
> LocalNet, Inc
> tbeecher at localnet.com
>
> Lewis Getschel wrote:
>
>> I'll chip in my 3 cents on this topic...
>>
>> I don't see which version of Nagios you are running. My comments here
>> are based on my Nagios 1.2 server.
>>
>> I found 2 different issues that prevented my event_handlers from running:
>> 1)
>> When I tackled event_handlers I had written as tcsh scripts, I found a
>> similar result to you, it LOOKED like it was being called (from event
>> log), but even the simple "echo $1 >/tmp/nagios_debug.txt" didn't do
>> anything. I finally tracked it down to (on MY system at least) Nagios
>> wouldn't execute /bin/tcsh scripts! I rewrote to /bin/sh instead, and
>> it started to work.
>>
>> 2)
>> This one had me baffled for 3 months (!).
>> My event handler command called for sending 6 variables
>> (command_line $USER1$/event_handler_diskmail.original
>> $SERVICESTATE$ $STATETYPE$ $SERVICEATTEMPT$ $HOSTNAME$ $SERVICEDESC$
>> $OUTPUT$). I had similar result, it didn't run.
>> I "solved" it when:
>> I changed it to 1 parameter, the script ran.
>> I added all 6, it stopped again.
>> I changed it to just 2, it ran again
>> I changed it to all 6, it stopped (see the pattern here <smirk>)
>> I ended up building it up 1 parameter at a time until I got to 5. It ran.
>> When I tried adding the output, it stopped again. I gave up at that
>> point and left it at 5 (originally 6 months ago it did work with all
>> 6, then it mysteriously stopped), my event handler doesn't handle 1
>> particular case now (I can live with that for now).
>>
>> Try changing your parameters to constants (that was my first hint when
>> troubleshooting) 1 2 3 and see if the script gets the constants at
>> least.
>>
>> My 3 cents, FWIW
>> Lewis
>>
>>
>> Thomas Beecher wrote:
>>
>>> I am restarting the service after every change, so that's not an
>>> issue. Learned that the hard way about 6 months ago...;-p
>>>
>>> When I change to this:
>>>
>>> command_line /bin/echo "$HOSTNAME$ $HOSTSTATE$" > /tmp/testing
>>>
>>> I do get the appropriate variables passed to the temp file. So it
>>> would appear that the event handler is at least being called, that's
>>> one less thing to look at!!
>>>
>>> This is where I'm at now. If I do this:
>>>
>>> /usr/local/nagios/libexec/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$ >>
>>> /tmp/testing
>>>
>>> from a command line, I get the proper output for the script.
>>> (normally won't be any, but I inserted some for troubleshooting.) If
>>> I call this from the event_handler, I only get the macro variables,
>>> nothing from the script.
>>>
>>>
>>> Thomas Beecher II
>>> Network Administrator
>>> LocalNet, Inc
>>> tbeecher at localnet.com
>>>
>>> Marc Powell wrote:
>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: nagios-users-admin at lists.sourceforge.net [mailto:nagios-users-
>>>>> admin at lists.sourceforge.net] On Behalf Of Thomas Beecher
>>>>> Sent: Wednesday, May 04, 2005 10:57 AM
>>>>> To: nagios-users at lists.sourceforge.net
>>>>> Subject: Re: [Nagios-users] Event Handlers Problem
>>>>>
>>>>> Well, that was a serious brain fart on my part!!
>>>>>
>>>>> I moved the script to /usr/local/nagios/libexec/, and changed
>>>>> checkcommands.cfg to show:
>>>>>
>>>>> define command{
>>>>> command_name restart_pm3
>>>>> command_line $USER1$/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$
>>>>> }
>>>>>
>>>>> $USER1$ is defined in resource.cfg as
>>>>>
>>>>> $USER1$=/usr/local/nagios/libexec
>>>>>
>>>>> Permissions on the file are:
>>>>>
>>>>> -rwxr-xr-x 1 nagios nagios 1701 2005-05-04 10:47 restart_pm3.pl
>>>>>
>>>>> I've changed the ownership to nagios/nagios to prevent any other
>>>>> potential permission issues.
>>>>
>>>> All Good.
>>>>
>>>>> This returns the following:
>>>>>
>>>>> [1115221615] HOST ALERT: buftest;DOWN;SOFT;1;Telnet: CRITICAL - Socket
>>>>> timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1
>>>>> ( )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>>>> [1115221615] HOST EVENT HANDLER: buftest;DOWN;SOFT;1;restart_pm3
>>>>> [1115221625] HOST ALERT: buftest;DOWN;SOFT;2;Telnet: CRITICAL - Socket
>>>>> timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1
>>>>> ( )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>>>> [1115221625] HOST EVENT HANDLER: buftest;DOWN;SOFT;2;restart_pm3
>>>>> [1115221634] HOST ALERT: buftest;DOWN;SOFT;3;Telnet: CRITICAL - Socket
>>>>> timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1
>>>>> ( )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>>>> [1115221634] HOST EVENT HANDLER: buftest;DOWN;SOFT;3;restart_pm3
>>>>>
>>>>> It doesn't error out, and seems to call the script, but it still
>>>>> doesn't run.
>>>>
>>>> Did you remember to restart Nagios after making the config changes?
>>>>
>>>>> I have not tested the script as the Nagios user, however the front of
>>>>> the script is set to dump whatever args get passed to it out to a file
>>>>> before doing anything else, so if it was choking somewhere in the
>>>>> script it would still be logged that it ran.
>>>>
>>>> Testing as the user the script is running as should be done anyway.
>>>> Often times it shows other, less obvious issues like required libraries
>>>> not being readable or executable by that user. I'd also suggest
>>>> simplifying things greatly by making your event handler --
>>>>
>>>> command_line /bin/echo "$HOSTNAME$ $HOSTSTATE$" > /tmp/testing
>>>>
>>>> Just to make sure that it's getting run (I'm 99% sure that'll work as I
>>>> expect ;) ).
>>>>
>>>> --
>>>> Marc
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null