Nagios 3 distributed monitoring and NSCA
Frederik Vanhee
fvanhee at gmail.com
Wed Sep 24 20:19:48 CEST 2008
Marc Powell wrote:
> On Sep 10, 2008, at 2:45 PM, Jonathan Call wrote:
>
>
>> In Nagios 2.x the Obsessive Compulsive Service Processor (OCSP) is
>> not very robust. Even with a few hundred service checks the OCSP stuff
>> on the distributed servers bogs down and does not send anything out.
>> This forced people like me to use tools like OCP_daemon.
>>
>
> I have to disagree with this as a general statement. I've used Nagios
> 2.x (currently .9), sending/receiving thousands of passive results
> every 5 minutes successfully for years. My 'largest' data collector
> (not dedicated to nagios) has all checks completed, or in progress, in
> the 5 minute interval --
>
> Total Services: 2198
> Services Checked: 2198
> Services Scheduled: 2198
> Active Service Checks: 2198
> Passive Service Checks: 0
> Total Service State Change: 0.000 / 6.250 / 0.011 %
> Active Service Latency: 38.626 / 68.765 / 59.834 sec
> Active Service Execution Time: 0.064 / 60.015 / 0.679 sec
> Active Service State Change: 0.000 / 6.250 / 0.011 %
> Active Services Last 1/5/15/60 min: 377 / 1804 / 2198 / 2198
> Passive Service State Change: 0.000 / 0.000 / 0.000 %
> Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
> Services Ok/Warn/Unk/Crit: 2189 / 0 / 0 / 9
> Services Flapping: 0
> Services In Downtime: 0
>
> One of my central receivers (2.9) --
>
> Total Services: 6137
> Services Checked: 6136
> Services Scheduled: 26
> Active Service Checks: 28
> Passive Service Checks: 6109
> Total Service State Change: 0.000 / 17.960 / 0.034 %
> Active Service Latency: 0.000 / 4.686 / 0.346 sec
> Active Service Execution Time: 0.000 / 2.529 / 0.444 sec
> Active Service State Change: 0.000 / 11.970 / 0.428 %
> Active Services Last 1/5/15/60 min: 3 / 3 / 26 / 26
> Passive Service State Change: 0.000 / 17.960 / 0.033 %
> Passive Services Last 1/5/15/60 min: 1104 / 5680 / 6107 / 6107
> Services Ok/Warn/Unk/Crit: 6107 / 1 / 0 / 29
> Services Flapping: 0
> Services In Downtime: 0
>
> One of my central receivers is still running nagios-1.3, with a
> database backend, and even it can keep up --
>
> Passive Checks:
>
> Time Frame             Checks Completed
> <= 1 minute:           628 (10.3%)
> <= 5 minutes:          5191 (85.0%)
> <= 15 minutes:         6105 (100.0%)
> <= 1 hour:             6105 (100.0%)
> Since program start:   6108 (100.0%)
>
>
>> Has the OCSP infrastructure improved in Nagios 3? I need it to be robust
>> enough to handle ~2500 service checks.
>>
>
> I'm doing nearly that now with nagios-2.9.
>
> --
> Marc
>
>
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>
I agree with Marc: I use OCSP in a distributed setup with 8000
passive services.
This worked fine on Nagios 1.x, 2.x and 3.0.3.
Frederik
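
For readers who have not set this up: in a distributed setup like the ones described above, the OCSP command on each collector typically hands every finished service check to NSCA's send_nsca client as one tab-separated line (host, service, return code, plugin output). The Python below is only a rough sketch of that formatting step, not what any poster in this thread runs. The function names and the sample host and service names are made up for illustration, and building one payload from several results is an assumption about how one might cut per-check overhead.

```python
# Sketch only: format passive service check results in the tab-separated
# form that send_nsca reads on standard input. All names are hypothetical.

VALID_STATES = (0, 1, 2, 3)  # OK, WARNING, CRITICAL, UNKNOWN


def format_result(host, service, return_code, output):
    """Return one passive service check line, terminated by a newline."""
    if return_code not in VALID_STATES:
        raise ValueError("return code must be 0, 1, 2 or 3")
    # Tabs and newlines inside the plugin output would break the
    # field and record boundaries, so flatten them to spaces.
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{return_code}\t{clean}\n"


def build_payload(results):
    """Join several (host, service, return_code, output) tuples into one
    payload, so a single send_nsca run could carry many results."""
    return "".join(format_result(*r) for r in results)


if __name__ == "__main__":
    sample = [
        ("web01", "HTTP", 0, "HTTP OK"),
        ("db01", "MySQL", 2, "CRITICAL - connection refused"),
    ]
    # The resulting text could be piped to send_nsca's standard input.
    print(build_payload(sample), end="")
```

Whether batching like this is needed depends on the installation; the figures quoted in this thread were reported with plain OCSP.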