RFC Proof of concept patch: Restarting embedded Perl Nagios periodically to halt memory consumption.

Ethan Galstad nagios at nagios.org
Thu Oct 21 04:01:28 CEST 2004


Following up on an old post that needs addressing...

The info from the Perl docs you provided earlier states that lost 
memory can only be freed by stopping the process.  Rather than 
including this functionality in Nagios (which would be a bit kludgy), 
I would recommend another approach (which is still hackish, but 
better IMO)...

Use cron to periodically run a script that calls the check_vsz plugin 
to determine the memory usage of the Nagios daemon.  If the plugin 
returns a critical state, restart Nagios using the init script.  
Simple, but effective.
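
A minimal sketch of the cron-driven check described above, written in C 
to match the patch quoted below.  It is only an illustration, not a 
tested tool: the helper names (run_command, should_restart, 
check_and_restart), the plugin path, the check_vsz arguments and 
thresholds, and the choice of init script path (borrowed from the quoted 
patch) are all assumptions to adjust for the local install.  The only 
convention relied on is that a Nagios plugin exits with status 2 for a 
critical state.

```c
/* Hypothetical cron helper: restart Nagios when check_vsz reports CRITICAL. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STATE_CRITICAL 2   /* Nagios plugin exit code for a critical state */

/* Assumed command lines; paths and thresholds are illustrative only. */
#define CHECK_CMD   "/usr/local/nagios/libexec/check_vsz -w 40000 -c 60000 -C nagios"
#define RESTART_CMD "/usr/local/etc/rc.d/nagios.sh restart"

/* Run a shell command; return its exit code, or -1 if it could not be
 * started or did not exit normally (for example, killed by a signal). */
int run_command(const char *cmd)
{
    int status = system(cmd);

    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Restart only on an explicit critical result, never on a warning,
 * an unknown state, or a failure to run the plugin. */
int should_restart(int plugin_exit_code)
{
    return plugin_exit_code == STATE_CRITICAL;
}

/* Returns 0 if no restart was needed, 1 if the restart command
 * succeeded, and -1 if the restart command failed. */
int check_and_restart(const char *check_cmd, const char *restart_cmd)
{
    int rc = run_command(check_cmd);

    if (!should_restart(rc))
        return 0;
    return run_command(restart_cmd) == 0 ? 1 : -1;
}
```

A small main() that calls check_and_restart(CHECK_CMD, RESTART_CMD) and 
is run from cron every few minutes would complete the picture.  Keeping 
the decision in should_restart() means a broken or missing plugin never 
triggers a restart by accident.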



On 18 Sep 2004 at 12:31, Stanley Hopcroft wrote:

> Dear Ladies and Gentlemen,
> 
> Nag 2.x attempts unsuccessfully (on my bad advice) to limit the
> maximum memory used by the embedded Perl Nag (ePN) process by
> periodically deallocating the Perl interpreter and re-initialising it.
> 
> Since 1.2 is my Nag test bed, these changes were backported to it, and
> the negative results were noted in an earlier letter.
> 
> However, changes to the reinit mechanism used by 2.x appear to deal
> with the problem of increasing memory usage by an ePN by _restarting_
> Nagios periodically.
> 
> The changes are
> 
> 1 In utils.c/reinit_embedded_perl(void)
> 
> fork, and in the child process exec the Nag startup script with
> the 'restart' parameter.
> 
> int reinit_embedded_perl(void){
> 
> #ifdef EMBEDDEDPERL
>         char buffer[MAX_INPUT_BUFFER];
>         pid_t pid ;
> 
>         snprintf(buffer,sizeof(buffer),"Restarting Nagios (to "
>                 "re-initialize embedded Perl interpreter) after %d uses "
>                 "...\n",embedded_perl_calls);
>         buffer[sizeof(buffer)-1]='\x0';
>         write_to_logs_and_console(buffer,NSLOG_INFO_MESSAGE,TRUE);
> 
>         pid=fork();
> 
>         if(pid==-1)
>                 exit(STATE_UNKNOWN) ;
> 
>         else if(pid==0){
> 
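>                 /* the argument list must end with a null char pointer */
>                 execlp("/usr/local/etc/rc.d/nagios.sh",
>                         "/usr/local/etc/rc.d/nagios.sh", "restart", (char *)NULL) ;
>                 /* reached only if execlp() failed */
>                 _exit(STATE_UNKNOWN) ;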
> 
>         } else {
> 
>                 exit(STATE_OK) ;
>         }
> #endif
>         return OK ;
> 
>         }
> 
> 
> 2 Make the Nag startup script suid root.
> 
> 2.1 minor changes to the startup script (to remove the su) and have
> the startup script append debug output to a file.
> 
> As with the 2.x code, reinit_embedded_perl() is called in checks.c
> whenever the number of calls to the embedded interpreter exceeds a
> threshold value.
> 
> It may well be that the restart is better done by the daemon process,
> rather than in a child forked to perform a service check. (This way
> seemed to me to be the fastest way to proceed, since there was already
> 2.x code with this structure.)
> 
> Here is an extract from the Nagios log showing some test results
> 
> [1095429760] Restarting Nagios (to re-initialize embedded Perl interpreter) after 101 uses ...
> [1095429760] Caught SIGTERM, shutting down...
> [1095429760] Nagios 1.2 starting... (PID=83831)
> [1095429760] Successfully shutdown... (PID=81306)
> [1095429760] Finished daemonizing... (New PID=83832)
> 
> [1095430344] Restarting Nagios (to re-initialize embedded Perl interpreter) after 101 uses ...
> [1095430344] Caught SIGTERM, shutting down...
> [1095430344] Successfully shutdown... (PID=83832)
> [1095430344] Nagios 1.2 starting... (PID=86358)
> [1095430344] Finished daemonizing... (New PID=86359)
> 
> I am now testing my prod Nag with this change and a threshold of
> 100_000 checks (should be about a week or a mem usage of 40-60 MB).
> 
> Yours sincerely.
> 
> -- 
> Stanley Hopcroft
> 
> Network specialist, IT Infrastructure
> IP Australia
> Ph: (02) 6283 3189  Fax: (02) 6281 1353
> PO Box 200 Woden  ACT 2606
> http://www.ipaustralia.gov.au
> 
> 
> _______________________________________________ Nagios-devel mailing
> list Nagios-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-devel
> 
> 



Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org






