Alleviating Nagios i/o contention problem
Frost, Mark {PBC}
mark.frost1 at pepsico.com
Sat Sep 25 14:30:09 CEST 2010
Greetings, listers,
We've got an ongoing issue with i/o contention. The obvious problem is that we've got a whole lot of things all writing to the same partition. In this case, there's just one big chunk of RAID 5 disk on a single controller, so I don't believe that making more partitions is going to help.
On this same partition we have:
1) Nagios 3.2.1 running as the central/reporting server for a couple of other Nagios nodes that are sending check results via NSCA. Approximately 6-7K checks.
2) pnp4nagios 0.6.2 (with rrd 1.4.2) writing graph data.
There's a 2nd server configured identically to the first that's acting as a "hot spare" so it also receives check data from the 2 distributed nodes and writes its own copy of the graph data locally as well.
At the moment I'm concerned about the graph data, but because I can only see i/o utilization as an aggregate, I can't tell which component is the worst offender on that filesystem -- status.dat updates? graph data? writes to the var/spool directory? We also expect continued growth, so this is only going to get worse.
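One way the aggregate might be broken down by process is sketched below. This is only a rough illustration under assumptions that may not hold on these servers: it relies on the kernel exposing per-process counters in /proc/<pid>/io and the command name in /proc/<pid>/comm (both need a reasonably recent Linux kernel with task I/O accounting enabled), on Python 3.6 or later, and on running as root so that other users' processes are readable. The function names and the 10-second interval are made up for the example.

```python
import os
import time


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def snapshot():
    """Return {pid: (command name, write_bytes)} for every readable process."""
    result = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/io") as f:
                io = parse_proc_io(f.read())
            with open(f"/proc/{pid}/comm") as f:
                name = f.read().strip()
        except OSError:
            continue  # process exited, file missing, or no permission
        result[pid] = (name, io.get("write_bytes", 0))
    return result


def top_writers(before, after, limit=10):
    """Rank processes present in both snapshots by bytes written in between."""
    deltas = []
    for pid, (name, written) in after.items():
        if pid in before:
            deltas.append((written - before[pid][1], pid, name))
    return sorted(deltas, reverse=True)[:limit]


if __name__ == "__main__":
    first = snapshot()
    time.sleep(10)  # assumed sampling interval; adjust to taste
    for delta, pid, name in top_writers(first, snapshot()):
        print(f"{delta:>12} bytes  pid {pid:>6}  {name}")
```

This only attributes writes to processes, not to files, so it would separate the Nagios daemon from the graphing processes but would not by itself distinguish status.dat updates from spool writes made by the same process.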
These systems are quite lightly loaded from a CPU (2 dual-core CPUs) and memory (4GB) perspective, but the i/o to the nagios filesystem is queuing now.
We're about to order new hardware for these servers and I want to make a reasonable choice. I'd like to make some sensible changes without requiring too exotic a setup. I believe these servers are currently Dell 2950s and they're all running SUSE Linux 10.3 SP2.
My first thought was to move the graphs to a NAS share, which would shift that i/o to the network. I don't know how that would work, though, and it would ultimately be an experiment.
What experiences do people out there have handling this kind of i/o and what have you done to ease it?
Thanks very much!
Mark