QUERY: Obsessive-Compulsive Processors obsessing too much?
bruce
nagios-devel at vicious.dropbear.id.au
Thu Apr 27 18:01:33 CEST 2006
One of the things that I ran into when tracking down my runaway process
issue is the handling of the obsessive-compulsive options. In
particular, while the service checks are run in parallel, the ocsp_command
is run in series (along with event handlers and so forth, all by
reap_service_checks). To illustrate, a set of 5 services may well be run
at the same time, taking T(service_run_time), but then the
ocsp_commands are run one after another, taking T(N*ocsp_command_time) :
Parallel Service        | | | | |                              T
Check Execution         | | | | |   ( run_service_checks() )
                        \ \ | / /                              I
Reaper                   \ \|/ /
Interval                  \ | /                                M
                            V       ( reap_service_checks )
                           ---      ( my_system )              E
Serial                     ---      ( my_system )
Obsessive-Compulsive       ---                                 |
Execution                  ---                                 V
                           ---
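
As a rough illustration of the difference in cost, here is a minimal Python
sketch (not Nagios code). The function names and the timings are
hypothetical; the only thing taken from the description above is that the
checks overlap while the ocsp_commands do not.

```python
def check_phase_time(check_times):
    """Checks run in parallel, so the phase lasts as long as the slowest check."""
    return max(check_times, default=0.0)


def ocsp_phase_time(ocsp_times):
    """ocsp_commands run one after another, so their durations add up."""
    return sum(ocsp_times)


# Hypothetical timings in seconds for a set of 5 services.
checks = [2.0, 1.5, 3.0, 2.5, 1.0]
ocsp = [0.5] * len(checks)

print(check_phase_time(checks))  # 3.0
print(ocsp_phase_time(ocsp))     # 2.5
```

The serial phase grows linearly with the number of services, while the
parallel phase only grows with the slowest check.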
In the setup that I am working on, I have Nagios running at a rate of at
least 1 service check per second, with an ocsp_command to distribute the
results to other machines. Thus, every reaper_interval (10 seconds),
Nagios hangs around for the ocsp_command to finish running for every
service check.
Since the ocsp_command is dependent on TCP handshakes to complete, the
time it takes to finish is noticeably variable, and thus Nagios continually
gets later and later.
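
The way the delay builds up can be sketched with a toy model in Python. This
is an assumption-laden simplification, not how Nagios actually schedules:
it assumes the reaper blocks for (checks per cycle * ocsp_command time) and
that anything beyond the reaper interval is carried over to the next cycle.
The function name and the per-command times are hypothetical.

```python
def scheduling_lag(cycles, checks_per_cycle, ocsp_seconds, reaper_interval=10.0):
    """Return the accumulated lag in seconds after each reaper cycle.

    Toy model: each cycle the reaper is busy for
    checks_per_cycle * ocsp_seconds, and any time spent beyond
    reaper_interval is added to the lag (which never goes below zero).
    """
    lag = 0.0
    history = []
    for _ in range(cycles):
        busy = checks_per_cycle * ocsp_seconds
        lag = max(0.0, lag + busy - reaper_interval)
        history.append(lag)
    return history


# 1 check per second and a 10 second reaper interval gives 10 checks per cycle.
print(scheduling_lag(3, 10, 1.25))  # [2.5, 5.0, 7.5]
print(scheduling_lag(3, 10, 0.5))   # [0.0, 0.0, 0.0]
```

Under these assumptions, once the average ocsp_command time exceeds the time
budget per check, the lag grows without bound instead of settling.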
My query is, do I shift my distributed monitoring to be more batched, and
run my distributed monitoring stuff off the periodic execution of
service_perfdata_file_processing_command, or do I change Nagios to run the
ocsp_command in a double fork like run_service_check()?
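
For reference, the general double-fork technique looks roughly like this in
Python. It is only a sketch of the idea on a Unix system, not the Nagios C
code; run_detached is a hypothetical name, and a real change would also need
to handle environment setup, timeouts and error reporting.

```python
import os


def run_detached(argv):
    """Start argv without waiting for it to finish, using a double fork."""
    pid = os.fork()
    if pid == 0:
        # First child: fork again and exit straight away, so the grandchild
        # is re-parented to init and the caller never has to reap it.
        try:
            if os.fork() == 0:
                # Grandchild: replace this process with the command.
                try:
                    os.execvp(argv[0], argv)
                finally:
                    os._exit(127)  # only reached if exec failed
        finally:
            os._exit(0)
    # Parent: the first child exits almost immediately, so this wait is short
    # and does not depend on how long the command itself runs.
    os.waitpid(pid, 0)
```

Calling run_detached(["sleep", "5"]) returns right away while the sleep
carries on in the background, which is the property that would stop the
reaper from waiting on each command in turn.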
--==--
Bruce.