help: nrpe + event handler
Arun G Nair
arungnair at gmail.com
Thu Nov 19 10:13:47 CET 2009
Hi,
We use red5 to stream media files on our website. Sometimes the RSS
of the red5 process reaches 3.2GB and it stops serving requests, so I
have to restart the service whenever it reaches that limit. I was doing
this manually for some time, and today I decided to write an event
handler for it. We use Nagios 2.6 on Debian etch. On the nagios server
I created a service definition:
define service{
    use                  system-service
    host_name            flash1-server
    service_description  check-red5-process-memory-size
    event_handler        restart_red5
    check_command        check_nrpe_1arg!check_red5_proc_mem
}
And the command definition for the event handler:
define command{
    command_name  restart_red5
    command_line  check_nrpe -H $HOSTADDRESS$ -c restart_red5 -a $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
}
Now, on the red5 server I've defined the commands as:
# check RSS of red5
command[check_red5_proc_mem]=/usr/lib/nagios/plugins/check_procs -C java -a 'red5' -w 2900000 -c 3100000 --metric=RSS
# event handler to restart red5
command[restart_red5]=/usr/local/bin/restart_red5.sh
I've added 'nagios' to /etc/sudoers:
red5:~# grep 'nagios' /etc/sudoers
nagios ALL= NOPASSWD: /etc/init.d/red5 restart
Below is the restart_red5.sh script:
-------------------
#!/bin/sh
# Event handler for nagios
# restarts red5 daemon when its memory consumption reaches 3GB
# args passed: $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
/usr/bin/logger -t nrpe "$@"
case "$1" in
    OK) ;;
    WARNING) ;;
    UNKNOWN) ;;
    CRITICAL)
        case "$2" in
            SOFT)
                case "$3" in
                    3)
                        /usr/bin/logger -t nrpe 'Restarting red5 server...'
                        /usr/bin/sudo /etc/init.d/red5 restart
                        /usr/bin/pgrep java && /usr/bin/logger -t nrpe 'Successfully restarted red5 server...'
                        ;;
                esac
                ;;
            HARD)
                /usr/bin/logger -t nrpe 'Restarting red5 server...'
                /usr/bin/sudo /etc/init.d/red5 restart
                /usr/bin/pgrep java && /usr/bin/logger -t nrpe 'Successfully restarted red5 server...'
                ;;
        esac
        ;;
esac
echo 'Done'
exit 0
-------------------
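The decision the script makes (restart on CRITICAL/HARD, or on the third
CRITICAL/SOFT attempt) can be pulled into one function that is easy to
test without touching red5. This is only a minimal sketch: should_restart
is a hypothetical helper name that is not in my script, and it prints the
decision instead of calling logger or sudo.

```shell
#!/bin/sh
# Hypothetical helper (not part of restart_red5.sh): decide whether the
# event handler would restart red5 for the given
# $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ values.
# Prints "restart" or "skip"; it performs no restart itself.
should_restart() {
    state="$1"
    state_type="$2"
    attempt="$3"
    case "$state" in
        CRITICAL)
            case "$state_type" in
                HARD)
                    echo restart
                    return 0
                    ;;
                SOFT)
                    # Same rule as the script: act only on the 3rd soft attempt.
                    if [ "$attempt" = "3" ]; then
                        echo restart
                        return 0
                    fi
                    ;;
            esac
            ;;
    esac
    # Anything else, including missing arguments, falls through to here.
    echo skip
}
```

Called with no arguments it prints "skip", which matches what I see:
the script falls through every case branch and only prints 'Done'.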
Now for the issue I have. From the nagios log on the server, I can see
that the event handler gets invoked, but on the red5 server nothing
happens. As you can see from the script, I log to syslog, yet nothing
shows up in the log. If I manually issue the check_nrpe command from
the nagios server, I just get 'Done' as the reply, whatever values I
pass as arguments.
nagios:~# /usr/lib/nagios/plugins/check_nrpe -H red5_server -c restart_red5 -a CRITICAL HARD 3
Done
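To check what actually reaches the script, the argument count and values
can be logged on one line. A minimal sketch, assuming it is pasted near
the top of restart_red5.sh; describe_args is a hypothetical helper name
that is not in the script above.

```shell
#!/bin/sh
# Hypothetical debugging helper: format the argument count and the
# arguments themselves as a single line.
describe_args() {
    printf 'argc=%d args=%s\n' "$#" "$*"
}

# Inside restart_red5.sh this could be sent to syslog with:
#   /usr/bin/logger -t nrpe "$(describe_args "$@")"
```

Unlike a bare logger "$@", this always writes a line containing the
count, so an empty argument list is visible in the log too.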
logger doesn't write the passed arguments to syslog as the script
instructs. If I print $# (the number of args), it's always 0. My guess
is that the arguments $SERVICESTATE$ $SERVICESTATETYPE$
$SERVICEATTEMPT$ are not getting passed to the script. I have also
tried passing the arguments in the service definition itself:
define service{
    use                  system-service
    host_name            flash1-server
    service_description  check-red5-process-memory-size
    event_handler        restart_red5!$SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
    check_command        check_nrpe_1arg!check_red5_proc_mem
}
define command{
    command_name  restart_red5
    command_line  check_nrpe -H $HOSTADDRESS$ -c restart_red5 -a $ARG1$
}
That didn't have any effect either.
What am I doing wrong here? I'm sure it's something simple, but I can't
seem to find it. Please help.
TIA
-Arun
--
...Keep Smiling...
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users