Nagios 3 distributed monitoring and NSCA
Marc Powell
marc at ena.com
Wed Sep 10 22:27:59 CEST 2008
On Sep 10, 2008, at 2:45 PM, Jonathan Call wrote:
> In Nagios 2.x the Obsessive Compulsive Service Processor (OCSP) is
> not very robust. Even with a few hundred service checks the OCSP stuff
> on the distributed servers bogs down and does not send anything out.
> This forced people like me to use tools like OCP_daemon.
I have to disagree with this as a general statement. I've used Nagios
2.x (currently 2.9), sending and receiving thousands of passive results
every 5 minutes, successfully for years. My 'largest' data collector
(not dedicated to Nagios) has all checks completed, or in progress,
within the 5 minute interval:
Total Services: 2198
Services Checked: 2198
Services Scheduled: 2198
Active Service Checks: 2198
Passive Service Checks: 0
Total Service State Change: 0.000 / 6.250 / 0.011 %
Active Service Latency: 38.626 / 68.765 / 59.834 sec
Active Service Execution Time: 0.064 / 60.015 / 0.679 sec
Active Service State Change: 0.000 / 6.250 / 0.011 %
Active Services Last 1/5/15/60 min: 377 / 1804 / 2198 / 2198
Passive Service State Change: 0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit: 2189 / 0 / 0 / 9
Services Flapping: 0
Services In Downtime: 0
One of my central receivers (Nagios 2.9):
Total Services: 6137
Services Checked: 6136
Services Scheduled: 26
Active Service Checks: 28
Passive Service Checks: 6109
Total Service State Change: 0.000 / 17.960 / 0.034 %
Active Service Latency: 0.000 / 4.686 / 0.346 sec
Active Service Execution Time: 0.000 / 2.529 / 0.444 sec
Active Service State Change: 0.000 / 11.970 / 0.428 %
Active Services Last 1/5/15/60 min: 3 / 3 / 26 / 26
Passive Service State Change: 0.000 / 17.960 / 0.033 %
Passive Services Last 1/5/15/60 min: 1104 / 5680 / 6107 / 6107
Services Ok/Warn/Unk/Crit: 6107 / 1 / 0 / 29
Services Flapping: 0
Services In Downtime: 0
One of my central receivers is still running nagios-1.3, with a
database backend, and even it can keep up:
Passive Checks:
  Time Frame             Checks Completed
  <= 1 minute:           628 (10.3%)
  <= 5 minutes:          5191 (85.0%)
  <= 15 minutes:         6105 (100.0%)
  <= 1 hour:             6105 (100.0%)
  Since program start:   6108 (100.0%)
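To make the three sets of figures above easier to compare, here is a small sketch that turns the "last 5 min" counts into percentages. The function name and the choice of denominators (total active checks for the collector, passive check counts for the receivers) are assumptions made for illustration; the counts themselves are copied from the stats above. These are point-in-time snapshots, so a check still in progress at the moment of sampling is not counted.

```python
def share_within(completed, total):
    """Percentage of checks seen inside the window, rounded to one decimal."""
    return round(100.0 * completed / total, 1)

# (checks seen in the last 5 minutes, total checks), taken from the stats above.
# Which total to divide by is an assumption for this illustration.
collector = share_within(1804, 2198)    # data collector, active checks
receiver_29 = share_within(5680, 6109)  # Nagios 2.9 receiver, passive checks
receiver_13 = share_within(5191, 6108)  # nagios-1.3 receiver, passive checks

print(collector, receiver_29, receiver_13)
```

The nagios-1.3 result should agree with the 85.0% that its own status page reports, which is a quick sanity check on the arithmetic.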
> Has the OCSP infrastructure improved in Nagios 3? I need it to be
> robust enough to handle ~2500 service checks.
I'm doing nearly that now with nagios-2.9.
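
For anyone new to this setup, the per-result work on a distributed server is small: the OCSP command hands each service result to send_nsca as one tab-separated line on stdin. The following is a minimal sketch of that step, not the configuration used on the servers described above. The function names, the central host name, and the file paths are hypothetical placeholders; only the line format and the send_nsca `-H` and `-c` options are taken from NSCA itself.

```python
import subprocess

def format_passive_result(host, service, return_code, output):
    """Build one service result line in the layout send_nsca reads on stdin:
    host<TAB>service<TAB>return_code<TAB>plugin_output<NEWLINE>."""
    # Tabs or newlines inside the plugin output would break the field layout.
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{return_code}\t{clean}\n"

def send_result(line,
                central_host="central.example.com",             # hypothetical
                send_nsca="/usr/local/nagios/bin/send_nsca",     # assumed path
                config="/usr/local/nagios/etc/send_nsca.cfg"):   # assumed path
    """Pipe one formatted line to send_nsca; raises if send_nsca fails."""
    subprocess.run([send_nsca, "-H", central_host, "-c", config],
                   input=line.encode(), check=True)

line = format_passive_result("web01", "HTTP", 2, "CRITICAL\ttimeout")
```

Calling `send_result(line)` would then forward that one result. Starting a new process for every result is the simplest design, though it may not be the cheapest at a few thousand results per interval.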
--
Marc