realistic system requirements and capacity

Carroll, Jim P [Contractor] jcarro10 at sprintspectrum.com
Thu Sep 19 16:57:52 CEST 2002


Hmm.  I don't know FreeBSD well enough to offer an expert opinion, but you
might want to investigate tuning the OS itself.  Possibly there's some sort
of buffer overflow happening...?

Speculating wildly....  Time to pull out all the stops on tuning the OS.
You might also want to run truss or strace (or whatever the FreeBSD
equivalent is) against the master Nagios process, logging it all to a file.
That gives you something to dissect the next time it abends.
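
Alongside a system-call trace, the Nagios log itself can be mined for what
preceded each crash.  A minimal Python sketch, assuming the log keeps the
"[epoch] message" layout shown in Jason's excerpt below; the function name
crash_contexts and the log path passed on the command line are illustrative,
not part of Nagios:

```python
import re
import sys
from collections import deque
from datetime import datetime, timezone

# Assumed log layout: "[<unix epoch seconds>] <message>"
LOG_LINE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def crash_contexts(lines, before=5):
    """Yield (crash_time_utc, preceding_lines) for each SIGSEGV entry.

    `before` is how many log lines ahead of the crash to keep.
    """
    recent = deque(maxlen=before)
    for raw in lines:
        line = raw.rstrip("\n")
        match = LOG_LINE.match(line)
        if match and "Caught SIGSEGV" in match.group(2):
            when = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
            yield when, list(recent)
            recent.clear()  # start fresh for the next crash
        else:
            recent.append(line)


if __name__ == "__main__":
    # Hypothetical usage: python crash_contexts.py /path/to/nagios.log
    with open(sys.argv[1], errors="replace") as log:
        for when, context in crash_contexts(log):
            print(f"--- crash at {when.isoformat()} ---")
            for entry in context:
                print(entry)
```

If the lines before every crash turn out to be the same kind of external
command, that narrows where to look in the trace output.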

If it's even an option, why not burn in a system with Linux and configure
that as your master?  If nothing else, you've eliminated FreeBSD as the
possible culprit.

Yet another option would be Solaris on Intel.  If that's your cuppa tea.

jc

> -----Original Message-----
> From: Jason Ahrens [mailto:Jason.Ahrens at telus.com]
> Sent: Thursday, September 19, 2002 7:49 AM
> To: nagios-users at lists.sourceforge.net
> Subject: RE: [Nagios-users] realistic system requirements and capacity
> 
> 
> We have ~3000 service checks in our system. We use a distributed model with
> 4 FreeBSD Intel boxes doing the actual checks, and a Sun Solaris system
> (E420, 3CPU, 4GB memory) collecting all the results and paging.
> 
> This configuration works well, when it works. Unfortunately Nagios seems to
> have a stability problem on the "master" system and dies frequently. We
> upgraded a day or two ago to 1.0b6 hoping for stability improvements but
> those hopes were not realized. Different systems produce the same result so
> I doubt it's hardware related. As yet, we've been unable to track down *why*
> nagios is crashing and we're having a harder and harder time convincing
> those in power to stick with it. They're starting to look at other options.
> 
> Does anyone know what might be causing our instability here, or how it might
> be fixed? It seems to die while processing all the external returns from the
> farmed out checks. The last lines in the log file are always something like
> this:
> 
> [1031774837] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;
> [1031774837] Caught SIGSEGV, shutting down...
> 
> Jason
> 
> --
> Jason Ahrens, System Analyst
> TELUS Enterprise Solutions
> http://www.telus.com
> 
> 
> -----Original Message-----
> From: ffejes at sears.com [mailto:ffejes at sears.com]
> Sent: Thursday, September 19, 2002 8:13 AM
> To: nagios-users at lists.sourceforge.net
> Subject: Re: [Nagios-users] realistic system requirements and capacity
> 
> 
> 
> Hello.  Our Nagios 1.0b4 system currently has ~250 hosts and ~350 services.
> We are running on a NetBSD 1.5.2 server with a PII/400MHz CPU and 128MB
> RAM.  This machine is also serving HTTP, FTP, and SMB.  Lately the load
> average has been hovering around 0.1 and the cpu is virtually idle most of
> the time.  I have not fully investigated distributed servers and, for the
> moment, simply have a smaller sparc linux machine serving as a hot standby.
> 
> Hope this info helps.
> 
> --frank
> 
> 
> 
> From: George Miscioscia <George.Miscioscia at Ticketmaster.com>
> Sent by: nagios-users-admin at lists.sourceforge.net
> Date: 09/18/2002 10:59 AM
> To: "Nagios-Users (E-mail)" <nagios-users at lists.sourceforge.net>
> cc:
> Subject: [Nagios-users] realistic system requirements and capacity
> 
> To any and all experienced nagios users, what system requirements would you
> recommend to run a Nagios install at full capacity, and what would you
> consider full capacity before implementing distributed servers? I.e., how
> many checks would you recommend one server perform?
> 
> thanks,
> 
> George Miscioscia
> Manager, Internet Systems
> Ticketmaster/Citysearch
> office (213)739-3521
> cell    (310)902-6743
> 
> 
> 
> -------------------------------------------------------
> This SF.NET email is sponsored by: AMD - Your access to the experts
> on Hammer Technology! Open Source & Linux Developers, register now
> for the AMD Developer Symposium. Code: EX8664
> http://www.developwithamd.com/developerlab
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users


-------------------------------------------------------
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf



