Larger networks
Marc Powell
mpowell at ena.com
Wed Sep 17 19:02:36 CEST 2003
How do you handle major network outages? We currently monitor 1810 hosts and 2348 services, and in my experience, when an outage takes down more than roughly 100 hosts, check latency skyrockets and Nagios is pretty useless for realtime status until the problem is resolved and Nagios is restarted. It's not unusual for us to have between 20 and 40 hosts down on the network at any given time (construction, circuit issues, end user error, etc.). I've tried modifying my host check command to ping only once, but I still had issues. I've been able to get around it by disabling host checking entirely, which is OK for the time being, but I sacrifice some of the more interesting functionality, such as parenting. I use a distributed environment (4 data collectors reporting back to 2 central hosts), and I expect that the passive host checks in 2.0 will help me out significantly, but until then I'm kind of stuck.
Don't get me wrong, I love Nagios and it's a wonderfully effective tool. I just wish I could get it just right in my environment when in crisis mode with all the features active ;)
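To put rough numbers on the latency problem described above, here is a back-of-the-envelope sketch. It assumes host checks run one at a time and that every unreachable host costs the full ping timeout on each attempt. The function name, the 10-second timeout, and the attempt counts are hypothetical values for illustration, not figures from this setup.

```python
def worst_case_host_check_delay(hosts_down, ping_timeout_s, max_attempts):
    """Seconds spent on host checks if each one runs serially and times out.

    Assumption: no check overlaps another, and every attempt against an
    unreachable host waits for the full timeout.
    """
    return hosts_down * ping_timeout_s * max_attempts


# 100 unreachable hosts with a hypothetical 10 s timeout.
print(worst_case_host_check_delay(100, 10, 3))  # three attempts per host
print(worst_case_host_check_delay(100, 10, 1))  # one attempt per host
```

Under those assumptions, cutting the attempts or the timeout shrinks the delay proportionally but does not remove it, which would be consistent with a single-ping host check still causing trouble during a large outage.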
--
Marc
________________________________________
From: Hochberg, Keith [mailto:Keith.Hochberg at mtvi.com]
Sent: Wednesday, September 17, 2003 11:41 AM
To: Gert Lindstrom; Nagios-users at lists.sourceforge.net
Gert,
Read up on NSCA and NRPE. I use NSCA for a distributed monitoring system. It is not as spread out geographically as yours, but I have over 400 hosts defined and 3,300 services. I use 192-bit encryption and have had no issues so far.
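As a rough illustration of what a data collector hands to NSCA: the send_nsca client reads service check results from standard input, one per line, with the host name, service description, return code, and plugin output separated by tabs by default. The sketch below only builds such a line. The function name and the sample host and service are made up for the example.

```python
def nsca_service_line(host, service, return_code, output):
    """Build one tab-delimited service check line for send_nsca's stdin."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError(
            "return code must be 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN)"
        )
    # Tabs and newlines delimit fields and records, so strip them from the text.
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{return_code}\t{clean}\n"


# Hypothetical host and service names, for illustration only.
line = nsca_service_line("web01", "HTTP", 2, "Connection refused")
print(repr(line))
```

A line like this would typically be piped into send_nsca, with the central server address given by -H and the client config file by -c.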