Concurrent Service Check Execution
David Knecht
david.knecht at anyweb.ch
Sat Sep 8 21:20:34 CEST 2007
I'd like to force Nagios 2.9 to execute all service checks on a given
monitored system *concurrently* (both in hard OK states as well as in
hard non-OK states). My goal is to see how my services behave on a
particular monitored system *at one single point in time*.
Let me clarify this:
Service checks on monitored system A:
  Service check cycle n:
    Execution of service check A1 ("check process 1"): 00h:00m:00s
    Execution of service check A2 ("check process 2"): 00h:00m:00s
  Service check cycle n+1:
    Execution of service check A1 ("check process 1"): 00h:05m:00s
    Execution of service check A2 ("check process 2"): 00h:05m:00s
  ...

Service checks on monitored system B:
  Service check cycle n:
    Execution of service check B1 ("check process 1"): 00h:00m:20s
    Execution of service check B2 ("check process 2"): 00h:00m:20s
  Service check cycle n+1:
    Execution of service check B1 ("check process 1"): 00h:05m:20s
    Execution of service check B2 ("check process 2"): 00h:05m:20s
  ...

Service checks on monitored system C:
  Service check cycle n:
    Execution of service check C1 ("check process 1"): 00h:01m:49s
    Execution of service check C2 ("check process 2"): 00h:01m:49s
  Service check cycle n+1:
    Execution of service check C1 ("check process 1"): 00h:06m:51s
    Execution of service check C2 ("check process 2"): 00h:06m:51s
  ...
--> As can be seen here, service checks A1 and A2 are executed
concurrently. The same applies to B1/B2 and C1/C2.
--> It doesn't much matter when these service checks are executed, as
long as the checks belonging to one system are executed concurrently.
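
To make the desired schedule concrete, here is a minimal Python sketch of
the timeline above. It is only an illustration of the goal, not anything
Nagios does or exposes: the names HOST_OFFSETS, SERVICES, CHECK_INTERVAL and
desired_schedule are hypothetical, the 5-minute interval is assumed from the
example for systems A and B, and only those two systems are modelled.

```python
from datetime import timedelta

# Hypothetical per-host start offsets, copied from the example for A and B.
HOST_OFFSETS = {
    "A": timedelta(seconds=0),
    "B": timedelta(seconds=20),
}
# Service descriptions as used in the example timeline.
SERVICES = ("check process 1", "check process 2")
# Assumed check interval (the example shows 5 minutes between cycles).
CHECK_INTERVAL = timedelta(minutes=5)


def desired_schedule(host, cycles):
    """Return (cycle, service, run_time) tuples for one host.

    Within a cycle, every service of the host gets the same run time,
    which is the property asked for in this post.
    """
    offset = HOST_OFFSETS[host]
    schedule = []
    for cycle in range(cycles):
        run_time = offset + cycle * CHECK_INTERVAL
        for service in SERVICES:
            schedule.append((cycle, service, run_time))
    return schedule


if __name__ == "__main__":
    for host in HOST_OFFSETS:
        for cycle, service, run_time in desired_schedule(host, 2):
            print(f"host {host} cycle {cycle} {service}: {run_time}")
```

The per-host offset can be arbitrary; what matters is that it is shared by
all services of that host in every cycle.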
According to http://nagios.sourceforge.net/docs/2_0/checkscheduling.html
and http://nagios.sourceforge.net/docs/2_0/images/noninterleaved1.png
non-interleaved checks come closest to what I want. It seems, though,
that service check execution gets a bit random the longer Nagios is running:
"Even though service checks are initially scheduled to balance the load
on both the local and remote hosts, things will eventually give in to
the ensuing chaos and be a bit random. Reasons for this include the fact
that services are not all checked at the same interval, some services
take longer to execute than others, host and/or service problems can
alter the timing of one or more service checks, etc. At least we try to
get things off to a good start. Hopefully the initial scheduling will
keep the load on the local and remote hosts fairly balanced as time goes
by..."
"Scheduling Delays: It should be noted that service check scheduling and
execution is done on a best effort basis. Individual service checks are
considered to be low priority events in Nagios, so they can get delayed
if high priority events need to be executed. Examples of high priority
events include log file rotations, external command checks, and service
reaper events. Additionally, host checks will slow down the execution
and processing of service checks."
Given all this, I assume that concurrent service checks as outlined
above cannot be configured in either Nagios 2.9 or 3.0. Am I missing
anything here? Is there a workaround?
--> As a workaround, it might be acceptable if service check A2 gets
executed ~2-5 seconds after A1. Is it possible to enforce such behaviour?
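
The relaxed requirement can be stated as a simple test on the start times
of one system's checks within one cycle. The sketch below is a hypothetical
helper for judging recorded check times, not part of Nagios; the function
name within_tolerance and the 5-second default are assumptions taken from
the "~2-5 seconds" figure above.

```python
def within_tolerance(run_times_s, max_spread_s=5.0):
    """Return True if all start times (in seconds) lie within max_spread_s.

    run_times_s holds the start times of every service check of one
    monitored system during a single check cycle.
    """
    if not run_times_s:
        # No checks recorded: nothing violates the tolerance.
        return True
    return max(run_times_s) - min(run_times_s) <= max_spread_s
```

For example, start times of 300.0 s and 303.5 s would pass, while 300.0 s
and 306.0 s would not.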
Thanks, David
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users