Adaptive Features in 2.0
Omeara, Randy
randy.omeara at lmco.com
Thu May 1 18:15:33 CEST 2003
Thank you all for your responses. Sorry for the delay in replying.
>>>>
From: Ethan Galstad [mailto:nagios at nagios.org]
I'd be more inclined to improve retention support across restarts
than add the ability to add/remove objects during runtime. The
overhead of doing so (consistency checking) doesn't make sense,
especially when Nagios is designed to do that when it (re)starts.
Retention in 2.0 has been improved (e.g. flap detection is now
retained), so I'd be inclined to focus efforts there. Restarting
Nagios with a SIGHUP shouldn't take more than a few seconds and the
only real thing that's lost is scheduling information (which is
recalculated at startup).
<<<<
Improved retention would also be a big plus. My original request came from
thinking about Nagios deployments with many thousands of objects, where
users manage their own objects. I'm thinking about efficiency and
scalability here...
(The sizes below are arbitrary. Scale them up or down as you wish to make
them believable in your experience.)
So, picture (say) 1,000 groups where each group has 4 members (people), 5
hosts, and 5 services per host. That is 5,000 hosts plus 25,000 services, or
30,000 host+service objects, and (at least) 1,000 contacts, 1,000 contact
groups, 1,000 hostgroups, etc. Let's just round it up to 35,000 objects,
about 25,000 of which are being actively monitored.
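To make the arithmetic easy to rescale, here is a rough Python sketch of the
object count. The constant and function names are made up for illustration,
and contacts are counted at the "at least 1,000" lower bound used above rather
than at 4 people per group.

```python
# Illustrative sizing only; the figures come from the scenario above.
GROUPS = 1_000
HOSTS_PER_GROUP = 5
SERVICES_PER_HOST = 5


def object_counts(groups=GROUPS):
    """Return per-type object counts for the hypothetical site."""
    hosts = groups * HOSTS_PER_GROUP
    services = hosts * SERVICES_PER_HOST
    return {
        "hosts": hosts,
        "services": services,
        "contacts": groups,       # lower bound: "at least" one per group
        "contactgroups": groups,
        "hostgroups": groups,
    }


counts = object_counts()
host_plus_service = counts["hosts"] + counts["services"]
listed_total = sum(counts.values())

print("hosts + services:", host_plus_service)
print("listed objects (lower bound):", listed_total)
```

The listed types add up to somewhat less than 35,000; the rounded figure
leaves room for the "etc." objects that are not itemized.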
For each change, Nagios has to be run with the verification option to check
for errors and, once the configuration validates, Nagios must be HUP'ed to
restart with the new change. In the best case, assume that the UI used to
enter object changes never lets an unacceptable change reach the object
validation phase of Nagios. Even then, each change puts Nagios through its
object check twice: once for verification and once at restart. That is,
70,000 objects are parsed and checked.
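The verify-then-HUP cycle described above could be scripted roughly as
follows. This is only a sketch: the function names are hypothetical, the
binary path, config path, and PID are supplied by the caller, and it assumes
that `nagios -v <main config>` exits with status 0 when the check passes.

```python
import os
import signal
import subprocess


def verify_config(nagios_bin, main_cfg, run=subprocess.run):
    """First full pass: run the verification option against the main config.

    Assumption: exit status 0 means the configuration is acceptable.
    """
    result = run([nagios_bin, "-v", main_cfg])
    return result.returncode == 0


def apply_change(nagios_bin, main_cfg, nagios_pid,
                 run=subprocess.run, kill=os.kill):
    """Verify the configuration and, only if it passes, send SIGHUP.

    The SIGHUP makes the running daemon re-read its configuration, which is
    the second full pass over every object.
    """
    if not verify_config(nagios_bin, main_cfg, run=run):
        return False  # leave the running daemon untouched
    kill(nagios_pid, signal.SIGHUP)
    return True
```

The `run` and `kill` parameters exist only so the flow can be exercised
without a live Nagios process; a real script would leave them at their
defaults.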
Now, assume that objects are being defined and modified at a rate of
10 per minute (might be high, but not extremely so). In a one-minute period,
Nagios validates and reloads 700,000 objects.
I can't say how much hardware and processing power it would take to run a
site like this right now. I only know that, with the changes I proposed, the
resources required to do the same work would be reduced by a factor of
70,000 (700,000 object checks versus 10).
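The comparison can be written out as a small Python sketch. The names are
illustrative, and the runtime-change figure rests on the assumption that each
change requires checking only the one object that changed; any consistency
checks against related objects would lower the ratio.

```python
# Figures from the scenario above; names are illustrative only.
TOTAL_OBJECTS = 35_000
PASSES_PER_CHANGE = 2      # one verification pass + one restart pass
CHANGES_PER_MINUTE = 10


def full_reload_checks(total=TOTAL_OBJECTS,
                       passes=PASSES_PER_CHANGE,
                       changes=CHANGES_PER_MINUTE):
    """Object checks per minute when every change triggers verify + restart."""
    return total * passes * changes


def runtime_change_checks(changes=CHANGES_PER_MINUTE):
    """Object checks per minute if only the changed object is checked.

    Assumption: exactly one object check per change.
    """
    return changes


reload_cost = full_reload_checks()
runtime_cost = runtime_change_checks()
reduction = reload_cost // runtime_cost

print("checks/minute with full reloads:", reload_cost)
print("checks/minute with runtime changes:", runtime_cost)
print("reduction factor:", reduction)
```

Because the ratio is simply the total object count times the number of
passes, the factor stays at 70,000 for this site whatever the change rate.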
Worthwhile?
Randy