nagios and apan cause server to crash...

Matthew Wilson matthewwilson at dsl.pipex.com
Mon Oct 6 17:15:04 CEST 2003


Hi guys,
Over the last couple of months I have read a few threads in the list archives about nagios and apan chewing up memory.  I have tried a few of the solutions posted but still have no joy.  Here's my situation:

Hardware - Dell Poweredge PIII 500 w/ 374MB memory, SCSI hard drive
Software - RedHat 9, Nagios 1.1 (installed from RPM) and apan.  The only other stuff of any significance running is ssh and httpd (for nagios alone).

Nagios is checking about 50 services on 50 hosts, using apan to ping every 5 minutes.  The machine will run for about 5 days and then seize up to the point that it requires power cycling.  (This is a particular pain as I administer the box remotely ;-)  Rebooting sorts it out, but after getting nagios going again the same thing happens.  Looking at my /var/log/messages, I see the kernel shutting down processes due to a lack of memory.  However, it's not nagios that is shut down but httpd (which normally consumes about 8MB).  It's a little difficult for me to see what's actually happening, as I look after this machine remotely on a contract and nobody is watching it 24-7.  Watching free on a daily basis, I see free memory gradually decreasing and buffer and cache sizes increasing (to the point where there is only about 16MB of memory free), but I understand this is what linux is supposed to do.  My nagios process doesn't appear to grow (at least not while the machine is stable) and is normally using about 1.5MB.
Here's what I have tried:

- Reducing max concurrent checks to 20.  No apparent effect other than giving big check latencies.
- Daily restart of the nagios process via a cron job.  No apparent effect.
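
To make the free/buffers/cache observation above easier to track between visits, here is a minimal sketch in Python that separates memory that is merely holding buffers and cache from memory that is actually free.  It assumes /proc/meminfo-style "Key: value kB" lines with MemFree, Buffers and Cached fields; the function names are hypothetical, and nothing here is part of nagios or apan.

```python
def parse_meminfo(text):
    """Parse '/proc/meminfo'-style lines of the form 'Key:  12345 kB'.

    Returns a dict mapping field name to its value in kB.
    Lines that do not match that shape are ignored.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        # Keep only 'number kB' entries; skip headers and byte-count rows.
        if len(fields) >= 2 and fields[0].isdigit() and fields[1] == "kB":
            info[key.strip()] = int(fields[0])
    return info


def memory_summary(info):
    """Summarise free memory versus memory held as buffers and cache (kB)."""
    free = info.get("MemFree", 0)
    buffers = info.get("Buffers", 0)
    cached = info.get("Cached", 0)
    return {
        "free_kb": free,
        "buffers_cache_kb": buffers + cached,
        # Rough upper bound on what the kernel could hand back to processes.
        "free_plus_cache_kb": free + buffers + cached,
    }


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the meminfo file (the default path is an assumption)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Logging memory_summary(read_meminfo()) with a timestamp from cron every few minutes should show whether free_plus_cache_kb really shrinks over the 5 days, or whether only free_kb does while the cache grows.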

Any help in diagnosing/fixing this would be much appreciated.  Also, please bear in mind that I only learnt linux for this project, so I may be missing something obvious ;-)

cheers
Matthew Wilson
DCSat.net