Bug report: nagios shutdown removing lock file too early
Ethan Galstad
nagios at nagios.org
Mon Jun 19 22:46:32 CEST 2006
Ton Voon wrote:
> Ethan,
>
> I think I've seen a problem with the nagios shutdown routine. If
> nagios is doing a host check and an INT signal is sent, it seems to
> take a long time before the nagios daemon dies. It looks like the
> child nagios process is trying to complete all the retries for a host
> check before going back into the main loop.
>
> Also, it appears that the lockfile is being removed before the main
> process dies. Below is the output of
> 'while true; do ps -p 728; ls -l /usr/local/nagios/var/nagios.lock; sleep 1; done'
> during a kill 728.
>
> [snipped]
> PID TT STAT TIME COMMAND
> 728 ?? Ss 0:01.95 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
> -rw-r--r-- 1 nagios nagios 4 Jun 13 17:20 /usr/local/nagios/var/nagios.lock
> PID TT STAT TIME COMMAND
> 728 ?? Ss 0:01.95 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
> -rw-r--r-- 1 nagios nagios 4 Jun 13 17:20 /usr/local/nagios/var/nagios.lock
> PID TT STAT TIME COMMAND
> 728 ?? Ss 0:01.95 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
> ls: /usr/local/nagios/var/nagios.lock: No such file or directory
> PID TT STAT TIME COMMAND
> 728 ?? Ss 0:01.95 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
> ls: /usr/local/nagios/var/nagios.lock: No such file or directory
>
> This shows the lockfile gets removed before the main daemon dies.
> (This is from a kill 728, not using any init scripts.) Eventually the
> daemon dies.
>
> I've tested this on Nagios 2.2 on MacOSX 10.4, Nagios 2.0 on Debian
> and Nagios 2.4 on Debian.
>
> Sorry, not had time to delve into the source code.
>
> Ton
>
> http://www.altinity.com
> T: +44 (0)870 787 9243
> F: +44 (0)845 280 1725
> Skype: tonvoon
Yep, this is a bug. It's been present for several years now, so I
suppose we could get around to fixing it. :-) Is the early lockfile
removal causing noticeable problems with anything? The file gets
deleted immediately upon receiving a SIGHUP/etc. to prevent it from
staying around if Nagios has problems shutting down.
Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org