Distributed Monitoring woes and performance issues.
Jason Rojas
jrojas at shopzilla.com
Tue Nov 8 17:15:17 CET 2005
Here is a good one for you guys.
I am currently monitoring roughly 4357 services on 700 hosts.
And that is not yet all of the hosts/services I need to be monitoring.
The output of nagios -s -c nagios.cfg
tells me that one complete run checking all of the services/hosts mentioned
above will take roughly 885 seconds (about 14.75 minutes).
That's bad.
Correct my math if I am wrong. It can take up to 14 minutes for nagios to
get around to checking HOSTX. OK, that's kind of bad. Now let's say HOSTX
decides to die right after nagios checks it. Then it can take up to an
additional 14 minutes to notice that it is down.
That gives me a huge response time on downed machines.
This is bad in my case, seeing as I am not even monitoring
everything yet.
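The arithmetic above can be sketched roughly as follows. This is only an illustration, assuming checks run as one complete pass after another and that a failure is caught by the next check of that host. The function names are made up for the example and are not part of nagios.

```python
# Figures quoted in the post.
SERVICES = 4357
CYCLE_SECONDS = 885  # estimated time for one complete pass


def cycle_minutes(cycle_seconds: float) -> float:
    """Convert the length of one complete pass to minutes."""
    return cycle_seconds / 60


def average_seconds_per_check(cycle_seconds: float, services: int) -> float:
    """Average share of one pass spent on each service check."""
    return cycle_seconds / services


def worst_case_detection_seconds(cycle_seconds: float) -> float:
    """Assumption: a host that dies right after its check is only
    noticed when its turn comes round again, one full pass later."""
    return cycle_seconds


if __name__ == "__main__":
    print(f"one pass: {cycle_minutes(CYCLE_SECONDS):.2f} minutes")
    print(f"per check: {average_seconds_per_check(CYCLE_SECONDS, SERVICES):.3f} s")
    print(f"worst-case detection: {worst_case_detection_seconds(CYCLE_SECONDS):.0f} s")
```

Under those assumptions the delay between a host dying and the next check of it is bounded by the length of one pass, which is where the "additional 14 minutes" comes from.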
I have tested and tested my config, and it seems that no matter what I do,
nagios is just not going to cut it.
I could raise the check interval from every 5 minutes to every 15,
but that still gives me latency issues.
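To see why the longer interval does not really help, here is a small sketch comparing the two intervals against the 885-second pass. It assumes, as a simplification and not as a description of the actual nagios scheduler, that each host gets checked once per interval or once per pass, whichever is longer. The helper names are hypothetical.

```python
CYCLE_SECONDS = 885  # estimated length of one complete pass, from the post


def overrun_seconds(cycle_seconds: int, interval_seconds: int) -> int:
    """How far one pass overruns the check interval (0 if it fits)."""
    return max(0, cycle_seconds - interval_seconds)


def worst_case_gap_seconds(cycle_seconds: int, interval_seconds: int) -> int:
    """Assumption: a host is rechecked once per interval, or once per
    pass if the pass takes longer than the interval."""
    return max(cycle_seconds, interval_seconds)


if __name__ == "__main__":
    for minutes in (5, 15):
        interval = minutes * 60
        print(
            f"{minutes:>2} min interval: "
            f"overrun {overrun_seconds(CYCLE_SECONDS, interval)} s, "
            f"worst-case gap {worst_case_gap_seconds(CYCLE_SECONDS, interval)} s"
        )
```

With these numbers the 15-minute interval just about fits one pass, but the gap between checks of any one host is still around 15 minutes either way.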
When I realized this, I decided that a distributed setup would be the way
to go, seeing as my company is deploying multiple co-locations. My master
server stores its data in a MySQL DB. The problem with being distributed
is that you cannot have the remote (scan-only) nodes send data back via
NSCA, because nagios is pulling its data from the DB. So some hosts don't
show up, and so on.
So I went ahead and modified the distributed nodes to send directly to
the DB. Not a good idea: there were so many inserts going on that it
rendered the database useless, and the web interface took forever to load.
Does anyone have any ideas for a solution to this besides an enterprise
grade monitoring system?
-Jason Rojas