Larger networks

Hochberg, Keith Keith.Hochberg at mtvi.com
Wed Sep 17 20:31:29 CEST 2003


Marc,

Unfortunately, I have the same issue. When there is a network outage and more than 15 hosts are down, I see check latency climb and nsca goes crazy; things don't return to normal until I restart. I disabled host checking for many of my hosts to try to combat this, but we do lose some functionality, as you say. I am hoping this gets better once 2.0 accepts passive host checks, but I would not discourage anyone from using Nagios in a distributed manner because of issues that occur only once in a while.

I think I remember Ethan saying the docs were being written and that he expected to release it in mid-October, so we'll see!

-Keith

-----Original Message-----
From: Marc Powell [mailto:mpowell at ena.com] 
Sent: Wednesday, September 17, 2003 1:03 PM
To: Hochberg, Keith; Nagios-users at lists.sourceforge.net
Subject: RE: [Nagios-users] Larger networks



How do you handle major network outages? We currently monitor 1810 hosts and 2348 services, and in my experience, when an outage takes down more than roughly 100 hosts, check latency skyrockets and Nagios is pretty useless for realtime status until the problem is resolved and Nagios is restarted. It's not unusual for us to have between 20 and 40 hosts down on the network at any given time (construction, circuit issues, end user error, etc.). I've tried modifying my host check command to ping only once, but I still had issues. I've been able to get around it by disabling host checking entirely, which is OK for the time being, but I sacrifice some of the more interesting functionality, such as parenting. I run a distributed environment (4 data collectors reporting back to 2 central hosts), and I expect the passive host checks in 2.0 to help me out significantly, but until then I'm kind of stuck.

Don't get me wrong, I love Nagios and it's a wonderfully effective tool. I just wish I could get it just right in my environment when in crisis mode with all the features active ;)

--
Marc
________________________________________
From: Hochberg, Keith [mailto:Keith.Hochberg at mtvi.com] 
Sent: Wednesday, September 17, 2003 11:41 AM
To: Gert Lindstrom; Nagios-users at lists.sourceforge.net

Gert,
 
Read up on nsca and nrpe. I use nsca for a distributed monitoring system. It is not as spread out geographically as yours, but I have over 400 hosts and 3,300 services defined. I use 192-bit encryption and have had no issues so far.
 

