host_name and service_description examples

Andreas Ericsson ae at op5.se
Sat Jun 11 03:48:04 CEST 2005


Thank you, everyone who has sent in testing data. I could still use more 
(hostnames in particular), but I've had enough to make a fair few 
conclusions that will help boost performance in Nagios as soon as I've 
had time to put a patch together and test it rigorously.

Those of you who are interested in the testing code and the various 
hashes I've considered for use in nagios can have a look at 
http://oss.op5.se/nagios/nagios-hash.tar.gz

Note that this is intended for developers only, and that it won't be 
useful for those who aren't, so you non-nerds might want to stop reading 
now. ;)


Worth noting about these hashes is that they are all very simple, and 
that hashing roughly 10 000 input lines (or even 4 200 000) doesn't 
take very long (at least on my laptop). It's locating objects in 
over-filled buckets that's slow.


Compile as such;
make
(it will complain about initialization from incompatible pointer types, 
but that's ok).

Run as such;
./hash hosts

You'll see output looking something like this (actual test data, but 
truncated; the real run tests eight hashes with lots of different 
bucket-table sizes):

*************************************************
Slots: 32768, bucket table size: 128KB
Strings: 10593. Min collisions 0.

Additive (current nagios) hash
Hash score: 75974.443793
Collisions: 8816, 83.224771%. Depth max: 75, average: 8.617791.
High mark: 3863, low mark: 466, range: 3397

Fowler/Noll/Vo hash (ham_func5)
Hash score: 1509.620363
Collisions: 1383, 13.055792%. Depth max: 4, average: 1.091555.
High mark: 4294646454, low mark: 624407, range: 4294022047

**************************************************

Slots is the number of slots (buckets) in the current setup. Bucket 
table size is the amount of memory consumed by the current scheme.
Strings is the number of input strings, and min collisions is 
(strings - slots), or 0 if there are at least as many slots as strings.
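
As a rough sketch of those two header values (not the code from the 
tarball): the function names are made up for illustration, and the 
pointer size is passed in because the 128KB figure for 32768 slots 
implies 4-byte bucket pointers, which is my assumption here.

```c
#include <stddef.h>

/* Lower bound on collisions: with more strings than slots, at least
 * (strings - slots) strings must land in an already occupied bucket.
 * Hypothetical helper, named for illustration only. */
static size_t min_collisions(size_t strings, size_t slots)
{
    return strings > slots ? strings - slots : 0;
}

/* Memory used by the bucket table alone: one head pointer per slot.
 * ptr_size is a parameter instead of sizeof(void *) because the
 * sample output suggests 4-byte pointers (an assumption). */
static size_t bucket_table_bytes(size_t slots, size_t ptr_size)
{
    return slots * ptr_size;
}
```

With the sample above, min_collisions(10593, 32768) is 0 and 
bucket_table_bytes(32768, 4) is 131072 bytes, i.e. 128KB.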

The first line of each hash-func entry is just a marker of which hash 
is currently being tested.

Hash score is calculated by multiplying the average collision depth by 
the total number of collisions. The lower the score, the better the hash.

Collisions is of course the number of collisions that occurred with the 
current hashing scheme. The % value is collisions divided by strings, 
i.e. the odds of a collision using the current input and bucket table 
size.
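
The two derived numbers can be reproduced from the sample output with 
a couple of one-liners. This is only a sketch with made-up function 
names; the formulas are the ones described above, and the results 
match the printed values to within the rounding of the printed average.

```c
/* Hash score as described above: average depth times collisions.
 * Lower is better. Hypothetical helper name. */
static double hash_score(unsigned long collisions, double avg_depth)
{
    return (double)collisions * avg_depth;
}

/* Share of the input strings that collided, in percent. */
static double collision_percent(unsigned long collisions,
                                unsigned long strings)
{
    if (strings == 0)
        return 0.0;   /* avoid dividing by zero on empty input */
    return 100.0 * (double)collisions / (double)strings;
}
```

For the additive hash, hash_score(8816, 8.617791) comes out at roughly 
75974.4 and collision_percent(8816, 10593) at roughly 83.22.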

Depth is the number of linked-list items one would have to traverse in 
order to reach the object one aims for (bucket-width). If max and 
average differ a lot, the hash function doesn't spread objects evenly 
across the buckets.
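
One plausible way to gather collisions and max depth is sketched 
below. It is not the tarball's code. I'm assuming that hash values are 
reduced to a slot with a plain modulo and that a collision is counted 
whenever a string lands in a bucket that already holds something 
(which agrees with the min collisions definition). The struct and 
function names are invented, and the average is left out since its 
exact definition isn't spelled out here.

```c
#include <stddef.h>
#include <stdlib.h>

struct bucket_stats {
    size_t collisions;  /* strings that hit an already occupied bucket */
    size_t max_depth;   /* longest chain found in any single bucket */
};

/* Walk precomputed hash values and count chain lengths per bucket.
 * Assumes slot = hash % slots. Returns 0 on success, -1 on error. */
static int collect_bucket_stats(const unsigned int *hashes, size_t n,
                                size_t slots, struct bucket_stats *out)
{
    size_t *depth;
    size_t i;

    out->collisions = 0;
    out->max_depth = 0;
    if (slots == 0)
        return -1;

    depth = calloc(slots, sizeof(*depth));
    if (depth == NULL)
        return -1;

    for (i = 0; i < n; i++) {
        size_t b = hashes[i] % slots;

        if (depth[b] > 0)
            out->collisions++;          /* bucket was already in use */
        if (++depth[b] > out->max_depth)
            out->max_depth = depth[b];  /* new longest chain */
    }

    free(depth);
    return 0;
}
```

Feeding it the values 1, 5, 9, 2 and 13 with 4 slots puts four of them 
in the same bucket, giving 3 collisions and a max depth of 4.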

High/Low mark are the maximum and minimum hash values (32-bit) returned 
by the current hash function, and range is the difference between them. 
Range is really the only valuable information here. If range is less 
than the number of slots (buckets), adding more slots can't spread the 
objects any further, so it's impossible to trade memory for speed. A 
high range value is therefore desirable.
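
To put a number on that: a hash whose values span a given range can 
never touch more than range + 1 distinct buckets, no matter how large 
the table is. A minimal sketch, assuming values are reduced with a 
modulo and using a made-up function name:

```c
/* Upper bound on the number of buckets a hash can ever reach, given
 * the highest and lowest values it returned and the table size.
 * Assumes high >= low. Hypothetical helper for illustration. */
static unsigned long reachable_slots(unsigned long high,
                                     unsigned long low,
                                     unsigned long slots)
{
    unsigned long range = high - low;

    /* Compare before adding 1 so a full-width range can't overflow. */
    if (range >= slots)
        return slots;
    return range + 1;
}
```

With the sample output, the additive hash (high 3863, low 466) can 
reach at most 3398 of the 32768 slots, while the Fowler/Noll/Vo values 
span far more than the table and can in principle reach all of them.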


Most of the hashes in the test come from
http://burtleburtle.net/bob/hash/ (a truly horrible page, design-wise, 
but very informative). Some others come from the MySQL sources (although 
originally from the IEEE Posix P1003.2 mailing list). The additive hash 
is my own implementation. It matches that currently in use in nagios.
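
For those who haven't met the two hashes shown in the sample output, 
here is a minimal sketch of each. These are not copied from the 
tarball or from the nagios sources: additive_hash() is simply a sum of 
the bytes, and fnv1_32() is the classic 32-bit FNV-1 with its 
published offset basis and prime. Which FNV variant ham_func5 actually 
uses isn't stated here, so treat that choice as an assumption.

```c
#include <stdint.h>

/* Additive hash: sum of all bytes. Cheap, but the order of the
 * characters doesn't matter and short strings give small values. */
static unsigned int additive_hash(const char *s)
{
    unsigned int h = 0;

    while (*s)
        h += (unsigned char)*s++;
    return h;
}

/* 32-bit FNV-1: multiply by the FNV prime, then xor in each byte. */
static uint32_t fnv1_32(const char *s)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */

    while (*s) {
        h *= 16777619u;                 /* FNV prime, wraps mod 2^32 */
        h ^= (unsigned char)*s++;
    }
    return h;
}
```

The difference shows up quickly with typical host names: "web12" and 
"web21" sum to the same additive value, while fnv1_32() gives them 
different hashes.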

Andreas Ericsson wrote:
> Ahoy all, and sorry for cross-posting.
> 
> I'm working on improving the hash-functions in Nagios in an effort to 
> boost performance. To do that, I need some real-world example input of 
> host_name and service_description variables to get accurate timing 
> results of the various hash-functions I'm considering.
> 
> Please understand that I'll be posting the test-data variables along 
> with the example timing code, so if your variables of that kind contain 
> sensitive information you shouldn't send it.
> 
> You can use the two commands below to extract host_names and 
> service_descriptions from your configuration.
> 
> Note that <cfg-files> should be replaced with something like 
> /usr/local/nagios/etc/*.cfg on a default installation. Mind the 
> line-breaks. Both sed-commands should be on a single line.
> 
> sed -n '/^[^#].*host_name/{s/.*host_name[\t ]*\([^\t ]*\)/\1/;p}' 
> <cfg-files> | sort | uniq > host_name_vars
> 
> sed -n '/^[^#].*service_description/{s/.*service_description[\t ]*\([^\t 
> ]*\)/\1/;p}' <cfg-files> | sort | uniq > service_description_vars
> 
> Please compress the files host_name_vars and service_description_vars 
> prior to sending it to me. bzip2 does the best job with text-files.
> 
> 
> Thanks for helping out.
> 

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer

