Event Handlers are not running or logging. (on WARNING or CRITICAL)
Bruce
bruce at webfarm.co.nz
Thu Sep 2 00:26:39 CEST 2004
Hi,
I think my email is not working correctly because I'm not getting
responses to my questions until I post a follow-up (very weird).
Has anyone had any thoughts on my findings below?
Just to refresh the issue:
Originally I thought event handlers were not running; however, I have
since found that they do run, but only when a service check returns OK
after having been in another state. This is not very useful, since an
event handler should be fixing problems as they occur, not trying to fix
them after they have been manually fixed. I've included a log file from
one host/service that experiences the problem (quoted below) so that
people can see what I mean.
Any thoughts would be appreciated,
--
+------------------------------------------+ \|||/
| Bruce at WebFarm.co.nz +64 06 7572881 | (o o)
| Systems Technician +---ooO-(_)-Ooo---+
| |
| WebFarm http://www.webfarm.co.nz |
| FreeParking http://www.freeparking.co.nz |
+------------------------------------------------------------+
... FreeParking - NZ's best value Domain, WebHosting and email accounts - bar none
... WebFarm - NZ's eCommerce specialists since 1997
bruce wrote:
>Hi,
>
>I've done a little more testing and it appears the event handlers ARE
>running, but only when the state changes to OK, which of course is no use
>for fixing the problem.
>
>Below is the nagios.log file from one of the live systems (well, the result
>of: egrep 'creeper.*Defun' var/nagios.log). freshclam seems
>to be running on all the servers but the Defunct processes check does get
>some results. The nagios configs are exactly the same for these also (the
>command sends fixdefuncts.sh instead of restartFreshClam.sh and that's the
>only difference).
>
>-- 8<-- nagios.log
>[1093669850] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 5 processes running with STATE = Z
>[1093670146] SERVICE ALERT: creeper;Defuncts;WARNING;HARD;1;WARNING - 6 processes running with STATE = Z
>[1093673451] SERVICE ALERT: creeper;Defuncts;WARNING;HARD;1;WARNING - 7 processes running with STATE = Z
>[1093677052] SERVICE ALERT: creeper;Defuncts;WARNING;HARD;1;WARNING - 8 processes running with STATE = Z
>[1093680652] SERVICE ALERT: creeper;Defuncts;WARNING;HARD;1;WARNING - 10 processes running with STATE = Z
>[1093684251] SERVICE ALERT: creeper;Defuncts;WARNING;HARD;1;WARNING - 10 processes running with STATE = Z
>[1093685900] SERVICE ALERT: creeper;Defuncts;CRITICAL;HARD;1;CRITICAL - 11 processes running with STATE = Z
>[1093687852] SERVICE ALERT: creeper;Defuncts;CRITICAL;HARD;1;CRITICAL - 11 processes running with STATE = Z
>[1093691451] SERVICE ALERT: creeper;Defuncts;CRITICAL;HARD;1;CRITICAL - 13 processes running with STATE = Z
>[1093695059] SERVICE ALERT: creeper;Defuncts;CRITICAL;HARD;1;CRITICAL - 15 processes running with STATE = Z
>[1093696438] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 0 processes running with STATE = Z
>[1093696438] SERVICE EVENT HANDLER: creeper;Defuncts;OK;HARD;1;allserver_defunct_fix
>[1093696516] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 0 processes running with STATE = Z
>[1093696624] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 0 processes running with STATE = Z
>[1093696673] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 0 processes running with STATE = Z
>[1093697080] SERVICE ALERT: creeper;Defuncts;OK;HARD;1;OK - 1 processes running with STATE = Z
>-- 8<-- End nagios.log
>
>As you can see it goes through the motions, OK => WARNING => CRITICAL =>
>OK (when we manually restart the offending process on the server; yes,
>the better fix would be to fix the process, but we are still investigating
>why it happens :( very weird, but a different issue).
>
>When changing from OK => WARNING it doesn't run the event handler; it only
>runs when the state goes back to OK.
>
>If I change the event handler's args to be a static CRITICAL, the handler
>logs in and does the restart, so everything is fine there.
>
>Here are the related config sections just for reference of this command
>and service:
>
>define service {
> use hosted
> service_description Defuncts
> check_command serv_check_zombie_procs
>
> event_handler allserver_defunct_fix
> event_handler_enabled 1
> hostgroup_name shared
>}
>define command {
> command_name allserver_defunct_fix
> command_line $USER1$/fix-w-allserver.sh $HOSTADDRESS$ $SERVICESTATE$ $SERVICEATTEMPT$ defunctFix.sh
>}
>
>
>Any thoughts or suggestions?
>
>Cheers,
>
>
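One way to check whether Nagios invokes the handler at all on WARNING or
CRITICAL is to have the handler script log every call before deciding what
to do. The following is only a sketch under stated assumptions: it is not
the actual fix-w-allserver.sh, the function names decide_action and
run_handler and the log path are made up for illustration, and the remote
fix is replaced by an echo placeholder. It assumes the same four arguments
as the command_line quoted above.

```shell
#!/bin/sh
# Hypothetical debugging wrapper; NOT the real fix-w-allserver.sh.
# Expected arguments, matching the quoted command_line:
#   $1 = $HOSTADDRESS$  $2 = $SERVICESTATE$  $3 = $SERVICEATTEMPT$  $4 = fix script

# Assumed log location, used only for debugging.
LOGFILE="${LOGFILE:-/tmp/event-handler-debug.log}"

decide_action() {
    # Print "fix" for problem states, "skip" for anything else (OK, UNKNOWN, ...).
    case "$1" in
        WARNING|CRITICAL) echo "fix" ;;
        *)                echo "skip" ;;
    esac
}

run_handler() {
    host="$1"; state="$2"; attempt="$3"; fix="$4"
    # Record every invocation, so the log shows whether Nagios called us at all.
    echo "$(date '+%Y-%m-%d %H:%M:%S') host=$host state=$state attempt=$attempt fix=$fix" >> "$LOGFILE"
    if [ "$(decide_action "$state")" = "fix" ]; then
        # Placeholder: the real script would log in to the host and run the fix here.
        echo "would run $fix on $host"
    else
        echo "nothing to do for state $state"
    fi
}

# Run only when called with the four arguments; otherwise just define the functions.
if [ "$#" -ge 4 ]; then
    run_handler "$@"
fi
```

If the debug log gets no new line when the service goes from OK to WARNING,
the handler is not being called for that transition, which would point at
the Nagios side rather than at the script.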