[PATCH] Re: alternative scheduler
Fredrik Thulin
ft at it.su.se
Fri Dec 3 10:06:22 CET 2010
On Wed, 2010-12-01 at 15:40 +0100, Fredrik Thulin wrote:
> On Wed, 2010-12-01 at 15:14 +0100, Andreas Ericsson wrote:
> ...
> > > Host checks were still being scheduled, and every time a host check was
> > > found at the front of event_list_low, Nagios would log "We're not
> > > executing host checks right now, so we'll skip this event." and then
> > > sleep for sleep_time seconds (0.25 was my setting, based on (Ubuntu)
> > > defaults) (!!!).
> >
> >
> > This should only happen if you've set a check_interval for hosts but
> > have disabled them globally, either via nagios.cfg or via an external
> > command. It seems weird that we run usleep() instead of just issuing
> > a sched_yield() or something though, which would be a virtual noop
> > unless other processes are waiting to run.
>
> Guilty of setting a check_interval for hosts, even on slave servers,
> yes.
Mea culpa. This sounded so plausible that I confessed right away, but
upon actually looking at my host template (all hosts use this), I don't
see what makes Nagios schedule host checks. This is what I was running
at the time (I've since tried to tune the reaping pass by disabling flap
detection, perf_data, event_handler and notifications on the check slave
servers, without any dramatic improvement):
define host {
        name                            SU-generic-host
        notifications_enabled           1
        event_handler_enabled           1
        flap_detection_enabled          1
        process_perf_data               1
        retain_status_information       1
        retain_nonstatus_information    1
        max_check_attempts              10
        notification_interval           1
        notification_period             24x7
        notification_options            d,u,r
        register                        0
}
> > > I made the attached minimalistic patch to not sleep if the next event in
> > > the event list is already due.
> > >
> >
> > Seems sensible, but I think it can be improved, such as issuing either
> > a sched_yield() or, if sched_yield() is not available, running usleep(10)
> > every 100 skipped items or so. That would avoid pinning the cpu but would
> > still be a lot faster than what we have today.
>
> What is sched_yield? I can't find that function anywhere in the source
> code. Feel free to improve the patch - as I've previously said C isn't
> my game.
Since you haven't responded or elaborated on your enhancement
suggestion, how about applying the patch I sent until someone works up
the incentive to improve it further?
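For reference, sched_yield() is a POSIX call declared in <sched.h>; it
gives up the CPU only if another runnable process is waiting. Below is a
rough sketch of how I read the suggestion, not the attached patch and not
code from the Nagios tree. The names choose_wait(), do_wait() and
SKIPS_PER_NAP are made up for illustration, and nanosleep() stands in for
the usleep(10) mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>   /* sched_yield() */
#include <time.h>    /* time_t, struct timespec, nanosleep() */

/* Hypothetical constant: take a short nap after this many skipped events. */
#define SKIPS_PER_NAP 100

enum wait_action { WAIT_SLEEP, WAIT_YIELD, WAIT_NAP };

/* Decide how to wait after an event was skipped.
 * now           - current time
 * next_run_time - run time of the event now at the front of the list
 * skipped       - number of events skipped so far in a row */
enum wait_action choose_wait(time_t now, time_t next_run_time,
                             unsigned long skipped)
{
    if (next_run_time > now)
        return WAIT_SLEEP;              /* nothing is due: normal sleep */
    if (skipped > 0 && skipped % SKIPS_PER_NAP == 0)
        return WAIT_NAP;                /* avoid pinning the CPU */
    return WAIT_YIELD;                  /* next event is already due */
}

/* Carry out the chosen action; sleep_time is in seconds (e.g. 0.25). */
void do_wait(enum wait_action action, double sleep_time)
{
    struct timespec ts;

    switch (action) {
    case WAIT_SLEEP:
        ts.tv_sec = (time_t)sleep_time;
        ts.tv_nsec = (long)((sleep_time - (double)ts.tv_sec) * 1e9);
        nanosleep(&ts, NULL);
        break;
    case WAIT_NAP:
        ts.tv_sec = 0;
        ts.tv_nsec = 10 * 1000;         /* 10 microseconds */
        nanosleep(&ts, NULL);
        break;
    case WAIT_YIELD:
        sched_yield();
        break;
    }
}
```

The point of splitting the decision from the action is only that the
decision can be checked without actually sleeping. The full sleep_time is
paid only when the next event really is in the future.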
> I'll try changing reaping interval to every 2 seconds as per your
> advice, but I guess it will still take 30-40% of the total time.
Tried this. When reaping every 2 seconds, each pass takes ~0.7 seconds
(roughly 35% of the total time), and no real improvement in check latency
can be observed.
/Fredrik