More performance optimizations for 3.2.0, this time optimizing startup time

Jonathan Kamens jkamens at advent.com
Thu Jan 28 11:45:42 CET 2010

Previous message: More performance optimizations for 3.2.0, this time optimizing startup time
Next message: Minor documentation issue
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 01/28/2010 03:49 AM, Ton Voon wrote:
> I'm wondering if there is a better way of doing this. Rather than each
> caller having to work out the next comment id value and making sure it
> is still unique, I think the next_comment_id should be manipulated
> within xrddefault.c instead. If you call add_new_comment with a
> comment_id of 0, it uses next_comment_id and ensures it gets incremented.
The right place to control the uniqueness of comment IDs is at the layer
where comments are abstracted, i.e., comments.c.

Xrddefault.c is the wrong place to do this because xrddefault.c is only
one of several possible comment storage layers. Furthermore, it is not
the only place where comments are added; there's also xsddefault.c.

The API exported by comments.c needs to represent and support the fact
that there are actually two different comment-creation cases -- adding a
comment with a known comment ID (i.e., read in from previous state), or
adding a newly created comment that needs to be assigned a comment ID.
The code in comments.c should do the assigning in that case and return
the new comment ID to the caller.

The API syntax for distinguishing these two cases could be as simple as
always passing in a pointer to the comment ID, where the value in said
pointer is 0 if comments.c should assign the next ID, or non-zero if it
is already assigned and should not be changed by the caller.

However, the disadvantage of that simple syntax is that it will allow
bugs to creep into the code, the most likely one being a situation in
which the caller forgets to initialize the comment_id value to 0 before
calling the comment add API, thus causing a comment to be created with a
previously used comment ID. To avoid this, either separate entry
point(s) should be provided for creating comments with predefined
comments ID, or there should be a flag in comments.c that the caller has
to be set which indicates whether the operation currently in progress is
reading in old state or generating new state; when the former is in
progress, the comment ID must always be specified, whereas when the
latter is in progress, it can never be.

This will get a little complicated, e.g., when generating flapping
comments dynamically when reading in retention.dat. You'd have code
that looks like:

...
set_reuse_comment_ids();
... code to read and parse retention.dat ...
if (... existing logic for determining flapping ...) {
set_new_comment_ids();
... create new comment about the flapping ...
set_reuse_comment_ids();
}
... finish reading and parsing retention.dat ...
set_new_comment_ids();
...

All of this is the "right" way to fix the brokenness in this code, but I
was trying to make my changes as minimal as possible to make it easier
to verify their correctness and therefore more likely that someone would
be able to review and commit them. Furthermore, the changes I've
described above are somewhat architectural in nature, and I don't feel
that I'm an experienced enough Nagios developer to be in a position to
make architectural changes. And finally, I do have a job I'm supposed
to be doing instead of hacking on Nagios :-), and I can't really justify
spending more time on the cleaner implementation described above.
> I guess on the initial load of retention.dat you have to validate that
> the next_comment_id is unique,
No, you don't, really. Retention.dat is not supposed to be edited by
the user. It should be safe to assume that when it was saved, the
comment IDs saved to it were unique. If someone does need to edit it
for some reason, then it is their responsibility to preserve the
uniqueness of comment IDs when doing so. Considering how large
retention.dat can get (ours is 27MB) and how expensive it is to read and
parse it, this is a reasonable optimization which decreases the time
taken to read and process the file.
> Can you extend the tests in t-tap/ to time the performance increase?
>
> So I'm thinking that the patch needs a bit more work to be applied.
You may choose to accept the patch as is or not, but I will point out
that although I clearly agree with you that there's a better way to
optimize this code, the changes I made certainly do not make the code
any worse than is now, and I would therefore argue that my changes are a
net gain regardless of whether the other changes described above are
implemented.

As for the testing, my goal here was to solve a specific problem for my
organization, and to contribute the work I did on that solution (finding
the bottleneck and optimizing it) back to the Nagios development team.
As noted above, this is not really how I'm supposed to be spending my
time at work, so I'd prefer if someone else added the test case.

Thanks,

jik

--
Jonathan Kamens
Operations Manager

Advent Tamale RMS <http://www.advent.com/solutions/by-product/tamale-rms>

201 South Street, Suite 300, Boston, MA 02111
Phone: +1 617 261 0264 ext. 133 | Mobile : +1 617 417 8989 |
Fax: + 1 617 812 0330
jkamens at advent.com <mailto:jkamens at advent.com> | www.advent.com
<http://www.advent.com/>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20100128/5654ab6d/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tamale.jpg
Type: image/jpeg
Size: 6255 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20100128/5654ab6d/attachment.jpg>
-------------- next part --------------
------------------------------------------------------------------------------
The Planet: dedicated and managed hosting, cloud storage, colocation
Stay online with enterprise data centers and the best network in the business
Choose flexible plans and management services without long-term contracts
Personal 24x7 support from experience hosting pros just a phone call away.
http://p.sf.net/sfu/theplanet-com
-------------- next part --------------
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel

Previous message: More performance optimizations for 3.2.0, this time optimizing startup time
Next message: Minor documentation issue
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the Developers mailing list