QUERY: Obsessive-Compulsive Processors obsessing too much?

bruce nagios-devel at vicious.dropbear.id.au
Thu Apr 27 18:01:33 CEST 2006

One of the things I ran into while tracking down my runaway process 
issue is the handling of the obsessive-compulsive options.  In 
particular, while the service checks are run in parallel, the ocsp_command 
is run in series (along with event handlers and so forth, all by 
reap_service_checks).  To illustrate, a set of 5 services may well be run 
at the same time, taking T(service_run_time) in total, but then the 
ocsp_commands are run one after another, taking T(N*ocsp_command_time):

   Parallel Service      |   |   |   |   |                                  T
   Check Execution       |   |   |   |   |   ( run_service_checks() )
                           \  \  |  /   /                                   I
   Reaper                    \  \|/   /
   Interval                    \ |  /                                       M
                                 V           ( reap_service_checks )
                                ---          ( my_system )                  E
   Serial                       ---          ( my_system )
   Obsessive-Compulsive         ---                                         |
   Execution                    ---                                         V
                                ---

In the setup that I am working on, Nagios runs at least 1 service check 
per second, with an ocsp_command that distributes the results to other 
machines.  Thus, at every reaper_interval (10 seconds), Nagios waits for 
the ocsp_command to finish for each of the 10 or more service checks 
being reaped, one after another.
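
The cost of the serial step can be estimated with some quick arithmetic. 
The sketch below is only a model, not Nagios code: the function name and 
the 0.5 second ocsp_command time used in the usage note are assumptions 
made for illustration.

```c
/* Hypothetical model, not part of Nagios: estimate how many seconds one
 * reaper pass spends blocked when ocsp_command is run once per result,
 * one after another. */
double serial_ocsp_seconds(double checks_per_second,
                           double reaper_interval,
                           double ocsp_seconds)
{
    /* Results that accumulate between two reaper passes. */
    double results_per_pass = checks_per_second * reaper_interval;

    /* Each result costs one full ocsp_command run, back to back. */
    return results_per_pass * ocsp_seconds;
}
```

With 1 check per second, a 10 second interval and an assumed 0.5 seconds 
per ocsp_command, serial_ocsp_seconds(1.0, 10.0, 0.5) gives 5 seconds of 
blocking per pass.  Once the per-command time goes above 1 second, the 
serial work alone takes longer than the interval it has to fit into.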

Since the ocsp_command depends on TCP handshakes completing, the time it 
takes to finish is noticeably variable, and thus Nagios falls further and 
further behind schedule.

My query is: do I shift my distributed monitoring to be more batched and 
run it off the periodic execution of 
service_perfdata_file_processing_command, or do I change Nagios to run the 
ocsp_command in a double fork, like run_service_check() does?
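
For reference, the double-fork idea could look roughly like the sketch 
below.  It is a generic POSIX illustration, not the actual Nagios code: 
run_detached() is a hypothetical helper, and handing the command line to 
/bin/sh is an assumption.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not a Nagios function: start `cmd` in a grandchild
 * process and return without waiting for it to finish.
 * Returns 0 if the command was handed off, -1 otherwise. */
int run_detached(const char *cmd)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                      /* first fork failed */

    if (pid == 0) {
        /* Intermediate child: fork again and exit straight away, so the
         * grandchild is orphaned and gets reaped by init, not by us. */
        pid_t grandchild = fork();

        if (grandchild < 0)
            _exit(1);                   /* second fork failed */
        if (grandchild == 0) {
            execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
            _exit(127);                 /* only reached if exec failed */
        }
        _exit(0);
    }

    /* Parent: reap the short-lived intermediate child so it does not
     * become a zombie.  This returns once the second fork has happened,
     * not when cmd completes. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The caller gets control back almost immediately, which is the point, but 
the exit status of cmd is no longer visible to it, and nothing in this 
sketch limits how many detached commands are in flight at once.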

--==--
Bruce.

