Nagios kept from restarting after reboot by lock file
Mike Lindsey
mike-nagios at 5dninja.net
Tue Dec 21 07:37:48 CET 2010
On 12/20/10 8:16 AM, eric.berg at barclayscapital.com wrote:
> Alternatively, could you recommend a good system/resource monitoring tool that would be able to let me know if nagios is down and restart it automatically?
>
Add a cron job that runs every five minutes (or whatever interval
you're comfortable with), similar to:
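#!/bin/bash
PATH=/bin:/usr/bin:/usr/local/bin
LOCKFILE=/home/nagios/nagios/var/nagios.lock
# Read the PID recorded in the lock file (empty if the file is missing)
PID=$(cat "${LOCKFILE}" 2>/dev/null)
# kill -0 sends no signal; it only tests whether the PID can be signalled,
# so check its exit status directly instead of capturing its output.
# A PID of 0 is never valid, and "kill -0 0" would test our own process
# group, so treat 0 as stale as well.
if [ -n "${PID}" ] && { [ "${PID}" = "0" ] || ! kill -0 "${PID}" 2>/dev/null; }
then
    rm -f "${LOCKFILE}"
    # INSERT RESTART COMMAND HERE
    echo "Killed Lockfile and restarted Nagios" | mail -s "Nagios restart $(hostname)" your-email@here.com
fi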
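Just be aware that the if block will also trigger if nagios is running
under a different username than the one the cron job runs as, because
kill -0 fails when you lack permission to signal the process. You can
check for that by doing some tests in the script with ps and grep.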
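
As a rough sketch of that kind of ps-based test (the function name
pid_is_running and the expected command name "nagios" are assumptions
for illustration, not anything shipped with Nagios), you could look the
PID up with ps, which can see processes owned by any user:

```shell
#!/bin/bash
# Hypothetical helper: succeeds if PID $1 exists and its command name
# equals $2 (default "nagios"), no matter which user owns the process.
pid_is_running() {
    local pid="$1" name="${2:-nagios}" comm
    # Reject empty or non-numeric input
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # PID 0 is never a real process
    [ "$pid" -gt 0 ] || return 1
    # ps exits non-zero when no process has this PID
    comm=$(ps -p "$pid" -o comm= 2>/dev/null) || return 1
    [ "$comm" = "$name" ]
}
```

You would then call pid_is_running "${PID}" in the script in place of
the kill -0 test. On some systems ps reports a full path in the comm
column, so the comparison may need adjusting for your platform.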
> _____________________________________________
> From: Berg, Eric: IT (NYK)
> Sent: Monday, December 20, 2010 11:03 AM
> To: 'nagios-users at lists.sourceforge.net'
> Subject: Nagios kept from restarting after reboot by lock file
>
> Gee, this seems like an annoying newbie problem, but if Nagios crashes or is killed (as on system reboot), it leaves a lock file around that prevents it from starting again until the lock file is manually removed.
>
> I see this on Monday mornings after weekend reboots on a Red Hat Linux box:
>
> nagios: Lockfile '/home/nagios/nagios/var/nagios.lock' looks like its already held by another instance of Nagios (PID 0). Bailing out...
Sounds like something in the shutdown process is throwing a 0 into the
pid file, or the startup in the rc script is.
Either way, you should never have a 0 in there: either the rc script is
putting the wrong data in there, or it's reporting incorrectly.
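
If you want a cron script to catch that case explicitly, a small helper
along these lines could validate the lock file before trusting it. This
is only a sketch: read_pidfile and its return codes are made up for
illustration.

```shell
#!/bin/bash
# Hypothetical helper: prints the PID stored in file $1 if it is a
# positive integer.
# Returns 0 on success, 1 if the contents are invalid (empty,
# non-numeric, or 0), and 2 if the file is missing or unreadable.
read_pidfile() {
    local file="$1" pid
    [ -r "$file" ] || return 2
    # Strip the trailing newline and any other whitespace
    pid=$(tr -d '[:space:]' < "$file")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$pid" -gt 0 ] || return 1
    printf '%s\n' "$pid"
}
```

A return code of 1 on /home/nagios/nagios/var/nagios.lock would point at
the "PID 0" situation from the error message, which is worth logging so
you can track down what is writing the bad value.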
> Does anyone know if there's a config option or something else that obviates the need to write a wrapper script to check to see if Nagios is really running and remove the lock file (looks like Nagios already knows it's not running by virtue of the value of the PID in this very message!) so that it can cleanly start up again?
--
Mike Lindsey