Bug report: nagios shutdown removing lock file too early

Ethan Galstad nagios at nagios.org
Mon Jun 19 22:46:32 CEST 2006


Ton Voon wrote:
> Ethan,
> 
> I think I've seen a problem with the Nagios shutdown routine. If
> Nagios is doing a host check and an INT signal is sent, it seems to
> take a long time before the Nagios daemon dies. It looks like the
> child Nagios process is trying to complete all the retries for a host
> check before going back into the main loop.
> 
> Also, it appears that the lockfile is being removed before the main
> process dies. Below is the output of the following loop, run during a
> 'kill 728':
>
>   while true; do ps -p 728; ls -l /usr/local/nagios/var/nagios.lock; sleep 1; done
> 
> [snipped]
>    PID  TT  STAT      TIME COMMAND
>    728  ??  Ss     0:01.95 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
> -rw-r--r--   1 nagios  nagios  4 Jun 13 17:20 /usr/local/nagios/var/nagios.lock
>    PID  TT  STAT      TIME COMMAND
>    728  ??  Ss     0:01.95 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
> -rw-r--r--   1 nagios  nagios  4 Jun 13 17:20 /usr/local/nagios/var/nagios.lock
>    PID  TT  STAT      TIME COMMAND
>    728  ??  Ss     0:01.95 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
> ls: /usr/local/nagios/var/nagios.lock: No such file or directory
>    PID  TT  STAT      TIME COMMAND
>    728  ??  Ss     0:01.95 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
> ls: /usr/local/nagios/var/nagios.lock: No such file or directory
> 
> This shows the lockfile gets removed before the main daemon dies.  
> (This is from a kill 728, not using any init scripts.) Eventually the  
> daemon dies.
> 
> I've tested this on Nagios 2.2 on Mac OS X 10.4, and on Nagios 2.0
> and Nagios 2.4 on Debian.
> 
> Sorry, I have not had time to delve into the source code.
> 
> Ton
> 
> http://www.altinity.com
> T: +44 (0)870 787 9243
> F: +44 (0)845 280 1725
> Skype: tonvoon

Yep, this is a bug.  It's been present for several years now, so I 
suppose we could get around to fixing it.  :-)  Is the early lockfile 
removal causing noticeable problems with anything?  The file gets 
deleted immediately upon receiving a SIGHUP/etc. to prevent it from 
staying around if Nagios has problems shutting down.
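
For illustration only, here is a minimal sketch of the opposite ordering, 
where the signal handler just records the shutdown request and the lock 
file is removed as the very last step before exit.  This is not taken from 
the Nagios source: every function and variable name below is hypothetical, 
and the sleep() loop is a stand-in for the real main loop.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/* All names here are hypothetical; none come from the Nagios source. */

/* Set by the signal handler, polled by the main loop. */
static volatile sig_atomic_t shutdown_requested = 0;

/* Signal handler: only record the request. The lock file is not touched. */
static void handle_shutdown_signal(int sig)
{
    (void)sig;
    shutdown_requested = 1;
}

/* Install the handler for SIGINT and SIGTERM. Returns 0 on success. */
static int install_shutdown_handlers(void)
{
    struct sigaction sa;

    sa.sa_handler = handle_shutdown_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;
    return 0;
}

/* Write the daemon's PID into the lock file. Returns 0 on success. */
static int write_lock_file(const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    fprintf(fp, "%ld\n", (long)getpid());
    return fclose(fp) == 0 ? 0 : -1;
}

/* Remove the lock file. Meant to be the last step before the process exits. */
static int remove_lock_file(const char *path)
{
    return unlink(path);
}

/* Skeleton daemon: the lock file outlives the main loop and cleanup. */
int run_daemon(const char *lock_path)
{
    if (install_shutdown_handlers() != 0 || write_lock_file(lock_path) != 0)
        return 1;

    while (!shutdown_requested)
        sleep(1);   /* placeholder for the real main loop */

    /* ... finish or abandon in-flight work here ... */

    return remove_lock_file(lock_path) == 0 ? 0 : 1;
}
```

The tradeoff is the one mentioned above: with this ordering, a daemon that 
hangs during shutdown leaves the lock file behind, so anything that reads 
the file would also have to check that the PID in it is still alive.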


Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org



