Event Handlers Problem
Lewis Getschel
lgetschel at denver.westerngeco.slb.com
Thu May 5 01:25:38 CEST 2005
I'll chip in my 3 cents on this topic...
I don't see which version of Nagios you are running. My comments here
are based on my Nagios 1.2 server.
I found 2 different issues that prevented my event_handlers from running:
1)
When I tackled event handlers that I had written as tcsh scripts, I saw a
result similar to yours: it LOOKED like the handler was being called
(according to the event log), but even a simple
"echo $1 >/tmp/nagios_debug.txt" didn't do anything. I finally tracked it
down: on MY system at least, Nagios wouldn't execute /bin/tcsh scripts!
I rewrote the handler for /bin/sh instead, and it started to work.
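
For illustration, a /bin/sh handler whose only job is to record what it
was given could look like the sketch below. DEBUG_LOG and
log_handler_args are placeholder names chosen for this example, and the
format of the log lines is arbitrary; nothing here depends on Nagios
itself.

```shell
#!/bin/sh
# Minimal debug event handler sketch (plain Bourne shell, no tcsh).
# DEBUG_LOG is a hypothetical path; override it in the environment if needed.
DEBUG_LOG="${DEBUG_LOG:-/tmp/nagios_debug.txt}"

# Append a timestamp, the argument count, and each argument on its own line.
log_handler_args() {
    {
        echo "$(date '+%Y-%m-%d %H:%M:%S') called with $# argument(s)"
        i=1
        for arg in "$@"; do
            echo "  arg $i: $arg"
            i=$((i + 1))
        done
    } >> "$DEBUG_LOG"
}

log_handler_args "$@"
```

If the log file stays empty after an event fires, the script was probably
never started at all, which points at the interpreter or the command line
rather than at the script's own logic.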
2)
This one had me baffled for 3 months (!).
My event handler command passed 6 macros to the script:
  command_line $USER1$/event_handler_diskmail.original $SERVICESTATE$
  $STATETYPE$ $SERVICEATTEMPT$ $HOSTNAME$ $SERVICEDESC$ $OUTPUT$
I got a similar result: it didn't run.
I "solved" it when:
I changed it to 1 parameter, and the script ran.
I added all 6, and it stopped again.
I changed it to just 2, and it ran again.
I changed it to all 6, and it stopped (see the pattern here? <smirk>).
I ended up building it up 1 parameter at a time until I got to 5. It ran.
When I tried adding the sixth ($OUTPUT$), it stopped again. I gave up at
that point and left it at 5. (Originally, 6 months ago, it did work with
all 6; then it mysteriously stopped.) My event handler doesn't handle 1
particular case now, but I can live with that for now.
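
One thing that may be worth ruling out (this is a guess, not something
confirmed anywhere in this thread) is whether the text substituted for
$OUTPUT$ contains characters that are special to the shell. The redirect
in the /bin/echo test quoted further down suggests the command line is
interpreted by a shell, so an unquoted parenthesis or semicolon in the
plugin output could make the whole line fail to parse before the handler
ever starts. The sketch below only demonstrates that shell behaviour;
SAMPLE_OUTPUT is made-up text and run_like_command_line is a hypothetical
helper.

```shell
#!/bin/sh
# Sketch: how an unquoted argument containing shell metacharacters can stop
# a command from running at all. SAMPLE_OUTPUT is made-up plugin output.
SAMPLE_OUTPUT='CRITICAL: snmpget returned errors: 1 (timeout); PING: CRITICAL'

# run_like_command_line LINE: hand one fully substituted line to /bin/sh,
# mimicking a command_line after macro substitution (an assumption about
# how the line is executed, not taken from the Nagios source).
run_like_command_line() {
    /bin/sh -c "$1" 2>/dev/null
}

# Unquoted: the "(" is a syntax error, so echo never runs.
if run_like_command_line "echo handler got: $SAMPLE_OUTPUT"; then
    echo "unquoted: ran"
else
    echo "unquoted: failed before the handler started"
fi

# Double-quoted: the same text arrives as a single argument.
if run_like_command_line "echo \"handler got: $SAMPLE_OUTPUT\""; then
    echo "quoted: ran"
else
    echo "quoted: failed"
fi
```

If this does turn out to be the cause, wrapping the macro in double
quotes in command_line ("$OUTPUT$") would be the obvious thing to try;
that has not been tested here.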
Try changing your parameters to constants such as 1 2 3 (that was my
first troubleshooting hint) and see if the script at least receives the
constants.
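
A throwaway handler for that constants test could be sketched as follows.
It assumes command_line has been pointed at this script with the literal
arguments 1 2 3 in place of the macros; RESULT_LOG and check_constants
are hypothetical names used only for this example.

```shell
#!/bin/sh
# Sketch of a throwaway handler for the "constants" test: point command_line
# at this script with the literal arguments 1 2 3 instead of macros.
# RESULT_LOG and check_constants are hypothetical names for illustration.
RESULT_LOG="${RESULT_LOG:-/tmp/nagios_constants_test.txt}"

# Succeed only if exactly the arguments 1 2 3 were received, in that order.
check_constants() {
    [ "$#" -eq 3 ] && [ "$1" = "1" ] && [ "$2" = "2" ] && [ "$3" = "3" ]
}

if check_constants "$@"; then
    echo "OK: received constants 1 2 3" >> "$RESULT_LOG"
else
    echo "MISMATCH: received $# argument(s): $*" >> "$RESULT_LOG"
fi
```

An "OK" line in the log means arguments reach the script intact, so any
remaining failure is more likely tied to what the macros expand to than
to the handler itself.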
My 3 cents, FWIW
Lewis
Thomas Beecher wrote:
> I am restarting the service after every change, so that's not an
> issue. Learned that the hard way about 6 months ago...;-p
>
> When I change to this:
>
> command_line /bin/echo "$HOSTNAME$ $HOSTSTATE$" > /tmp/testing
>
> I do get the appropriate variables passed to the temp file. So it
> would appear that the event handler is at least being called, that's
> one less thing to look at!!
>
> This is where I'm at now. If I do this:
>
> /usr/local/nagios/libexec/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$ >>
> /tmp/testing
>
> from a command line, I get the proper output for the script. (normally
> won't be any, but I inserted some for troubleshooting.) If I call this
> from the event_handler, I only get the macro variables, nothing from
> the script.
>
>
> Thomas Beecher II
> Network Administrator
> LocalNet, Inc
> tbeecher at localnet.com
>
> Marc Powell wrote:
>
>>
>>> -----Original Message-----
>>> From: nagios-users-admin at lists.sourceforge.net [mailto:nagios-users-
>>> admin at lists.sourceforge.net] On Behalf Of Thomas Beecher
>>> Sent: Wednesday, May 04, 2005 10:57 AM
>>> To: nagios-users at lists.sourceforge.net
>>> Subject: Re: [Nagios-users] Event Handlers Problem
>>>
>>> Well, that was a serious brain fart on my part!!
>>>
>>> I moved the script to /usr/local/nagios/libexec/, and changed
>>> checkcommands.cfg to show:
>>>
>>> define command{
>>> command_name restart_pm3
>>> command_line $USER1$/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$
>>> }
>>>
>>> $USER1$ is defined in resource.cfg as
>>>
>>> $USER1$=/usr/local/nagios/libexec
>>>
>>> Permissions on the file are:
>>>
>>> -rwxr-xr-x 1 nagios nagios 1701 2005-05-04 10:47 restart_pm3.pl
>>>
>>> I've changed the ownership to nagios/nagios to prevent any other
>>> potential permission issues.
>>
>>
>>
>> All Good.
>>
>>
>>> This returns the following:
>>>
>>> [1115221615] HOST ALERT: buftest;DOWN;SOFT;1;Telnet: CRITICAL - Socket
>>> timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1
>>> ( )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>> [1115221615] HOST EVENT HANDLER: buftest;DOWN;SOFT;1;restart_pm3
>>> [1115221625] HOST ALERT: buftest;DOWN;SOFT;2;Telnet: CRITICAL - Socket
>>> timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1
>>> ( )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>> [1115221625] HOST EVENT HANDLER: buftest;DOWN;SOFT;2;restart_pm3
>>> [1115221634] HOST ALERT: buftest;DOWN;SOFT;3;Telnet: CRITICAL - Socket
>>> timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1
>>> ( )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>> [1115221634] HOST EVENT HANDLER: buftest;DOWN;SOFT;3;restart_pm3
>>>
>>> It doesn't error out, and seems to call the script, but it still
>>> doesn't run.
>>
>>
>>
>> Did you remember to restart Nagios after making the config changes?
>>
>>
>>> I have not tested the script as the Nagios user; however, the front of
>>> the script is set to dump whatever args get passed to it out to a file
>>> before doing anything else, so if it was choking somewhere in the
>>> script it would still be logged that it ran.
>>
>>
>>
>> Testing as the user the script is running as should be done anyway.
>> Oftentimes it shows other, less obvious issues like required libraries
>> not being readable or executable by that user. I'd also suggest
>> simplifying things greatly by making your event handler --
>>
>> command_line /bin/echo "$HOSTNAME$ $HOSTSTATE$" > /tmp/testing
>>
>> Just to make sure that it's getting run (I'm 99% sure that'll work as I
>> expect ;) ).
>>
>> --
>> Marc
>
--
Lewis Getschel | Today is done...
WesternGeco | Today was fun...
1625 Broadway | Tomorrow is another one.
Denver, CO 80202 |
Direct Phone - 303-389-4407| -- Dr. Seuss --