Multithreaded Macro Support wrapper proposal
Ton Voon
ton.voon at opsera.com
Sat Aug 22 08:48:17 CEST 2009
Hi Steven,
On 21 Aug 2009, at 18:35, Steven D. Morrey wrote:
> To that end I have decided to robustify the macro system by creating
> a handful of wrapper functions that will make the macros thread safe
> (as long as all macro calls are passed through them).
> These functions are
Taking a different approach: which part of the macro setting routines
is taking the most time? My guess is that the summary macros take the
most time, because they have to walk through the entire list of hosts
and services (see http://nagios.sourceforge.net/docs/3_0/macrolist.html).
You could disable summary macro processing with the large installation
tweaks (http://nagios.sourceforge.net/docs/3_0/largeinstalltweaks.html)
and see if the timings still show the macro portion to be causing the
bottleneck. I think you are on Nagios 2 though, so this option is not
available. You could try just commenting out that entire block and see
how it affects the profiling.
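For anyone reading along who is on Nagios 3, the tweak is a single
directive in the main configuration file. This is only a fragment,
assuming the file is nagios.cfg; the directive also changes other
behaviour described on the tweaks page, so check that page first:

```
# nagios.cfg (Nagios 3 only)
# Among other things, stops summary macros being exported as
# environment variables.
use_large_installation_tweaks=1
```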
With Opsview, we had a customer whose CPU was spinning at 100%. Using
strace, we found that the time was going into the notifications logic
that sets all the macro environment variables. But we knew that the
customer **didn't have notifications enabled for any contacts**. It
turned out that when Nagios got an alert event, it would set the macros
first and only then work out whether the contact should be notified. We
changed the loop so that it first checks whether the contact should be
notified and only then calculates the macros. This reduced their CPU
usage to 10%.
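The reordering can be sketched roughly as below. This is not the
Nagios code or the patch itself: contact_t, should_notify(),
set_macros_for_contact() and both loop functions are hypothetical
stand-ins, and the counter exists only so the difference in work can
be observed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a contact; not the real Nagios struct. */
typedef struct {
    const char *name;
    bool notifications_enabled;
} contact_t;

/* Counts how often the (expensive) macro setup runs. */
int macro_setup_calls = 0;

/* Placeholder for the costly work of setting macro environment variables. */
void set_macros_for_contact(const contact_t *c) {
    (void)c;
    macro_setup_calls++;
}

/* Placeholder for the real notification viability checks. */
bool should_notify(const contact_t *c) {
    return c->notifications_enabled;
}

/* Old ordering: macros are set for every contact, then the check runs. */
int notify_macros_first(const contact_t *contacts, size_t n) {
    int sent = 0;
    for (size_t i = 0; i < n; i++) {
        set_macros_for_contact(&contacts[i]);
        if (!should_notify(&contacts[i]))
            continue;
        sent++; /* the notification would be sent here */
    }
    return sent;
}

/* New ordering: check first, pay for macros only when notifying. */
int notify_check_first(const contact_t *contacts, size_t n) {
    int sent = 0;
    for (size_t i = 0; i < n; i++) {
        if (!should_notify(&contacts[i]))
            continue;
        set_macros_for_contact(&contacts[i]);
        sent++; /* the notification would be sent here */
    }
    return sent;
}
```

Both functions send the same number of notifications; only the number
of macro setups differs. With no contacts enabled, the second loop
never touches the macros at all, which matches what we saw at the
customer site.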
Patch for Nagios 2.10: https://secure.opsera.com/svn/opsview/branches/BRAN-2.14/opsview-base/patches/nagios_reduce_notifications_load.patch
Patch for Nagios 3: https://secure.opsera.com/svn/opsview/branches/BRAN-3.1/opsview-base/patches/nagios_reduce_notifications_load.patch
I haven't put this into core code yet because I'm trying to work out a
way to test it. Even though I know this works for the thousands of
users using Opsview, I set myself a different standard when it comes
to the hundreds of thousands of users of Nagios :)
I'd be grateful if anyone wants to write a libtap test that
demonstrates this problem, so that I can get the fix applied to core
code.
Ton