Nagios stress-test
Andreas Ericsson
ae at op5.se
Tue Apr 5 09:37:33 CEST 2005
Ahoy all.
This mail is intended for those of you who are interested in contributing
to Nagios but aren't very comfortable with thread-safe C programming.
Others might want to skip this mail.
I've created a small but significantly weird plugin called check_rand,
available for download at http://oss.op5.se/nagios and
https://devel.op5.se/oss
check_rand will:
* exit with a properly pseudo-random exit code between 0 and 3. Each
code has been tested to be equally likely.
* print a random message 50% of the time, and the message "Life, loathe
it or ignore it, you can't like it" the rest of the time (it happened
to be the first message that the fortune program spat out).
* print perfdata 50% of the time.
* print empty perfdata 25% of the time (half the times it prints perfdata).
* *possibly* time out 12.5% of the times it's run (the value passed to
sleep is random, so it might not time out after all, and sleep only has
a 12.5% chance of being called).
* sleep up to 12 seconds prior to exiting, after having printed output.
This will let it sometimes time out AFTER having printed output, which
might be a source of crashes.
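For anyone who wants a feel for that behaviour without reading the C source, here is a rough shell sketch of the list above. It is not the real check_rand: the function names (rand_byte, check_rand_sketch), the stand-in "random message", the rand= perfdata label and the reading of "empty perfdata" as a bare "|" are all assumptions made for illustration, and the two sleep items are treated as one and the same sleep.

```shell
#!/bin/sh
# Hypothetical sketch of the behaviour listed above; NOT the real check_rand.

# Print one byte from /dev/urandom as an unsigned decimal number (0-255).
rand_byte() {
    od -An -N1 -tu1 /dev/urandom | tr -dc '0-9'
}

# $1: upper bound for the sleep in seconds (defaults to 12).
check_rand_sketch() {
    max_sleep=${1:-12}

    # 256 is a multiple of 4, so each exit code 0..3 is equally likely.
    code=$(( $(rand_byte) % 4 ))

    # Half the time use a stand-in "random" message, otherwise the fixed one.
    if [ $(( $(rand_byte) % 2 )) -eq 0 ]; then
        msg="random message number $(rand_byte)"
    else
        msg="Life, loathe it or ignore it, you can't like it"
    fi

    # One draw in 0..3: 0 = real perfdata, 1 = empty perfdata, 2-3 = none.
    case $(( $(rand_byte) % 4 )) in
        0) msg="$msg|rand=$(rand_byte)" ;;
        1) msg="$msg|" ;;
    esac

    printf '%s\n' "$msg"

    # One run in eight (12.5%) sleeps for 0..max_sleep seconds after printing.
    if [ $(( $(rand_byte) % 8 )) -eq 0 ]; then
        sleep $(( $(rand_byte) % (max_sleep + 1) ))
    fi

    return "$code"
}

# Example run with the sleep bound set to 0 so it returns immediately.
check_rand_sketch 0 || echo "(non-OK exit code: $?)"
```

Calling check_rand_sketch with no argument uses the 12 second bound, which only matters for a timeout if your configured plugin timeout is shorter than that.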
To use it in place of every check command, as well as every notification
command (you won't want the barrage of notifications this script will
generate), you should run something like this:
cp misccommands.cfg misccommands.cfg.bak
cp checkcommands.cfg checkcommands.cfg.bak
sed 's,\(command_line[^/]*\)[^ ]*\(.*\),\1/check_rand\2,' \
    checkcommands.cfg.bak > checkcommands.cfg
sed 's,\(command_line[^/]*\)[^ ]*\(.*\),\1/check_rand\2,' \
    misccommands.cfg.bak > misccommands.cfg
(Make sure you get those sed lines right. Cut'n'paste is your friend.)
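To see what the sed expression does before letting it loose on real config files, it can be run on a couple of sample lines. The two definitions below are made up for illustration, in the style of the sample config files; only the sed expression itself comes from this mail.

```shell
#!/bin/sh
# The substitution from the sed lines above, wrapped in a function.
rewrite_command_line() {
    sed 's,\(command_line[^/]*\)[^ ]*\(.*\),\1/check_rand\2,'
}

# Hypothetical definitions; single quotes keep the $MACRO$ names literal.
macro_path='command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$'
abs_path='command_line /usr/bin/printf "%b" "$NOTIFICATIONTYPE$" | /bin/mail -s alert $CONTACTEMAIL$'

# Everything before the first "/" is kept, the path up to the next space
# is replaced by /check_rand, and the rest of the line is left alone.
printf '%s\n' "$macro_path" | rewrite_command_line
# prints: command_line $USER1$/check_rand -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$

printf '%s\n' "$abs_path" | rewrite_command_line
# prints: command_line /check_rand "%b" "$NOTIFICATIONTYPE$" | /bin/mail -s alert $CONTACTEMAIL$
```

Note the second case: a command given by absolute path ends up as /check_rand, and anything after the first space (including a pipe to a mailer) is kept. If your config has lines like that, it is probably worth checking them by hand.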
Or you can manually change the command actually run to check_rand (or
symlink every plugin you have to check_rand, or something else that will
ensure that check_rand is run instead of the actual plugin). This kind
of stress-testing is fairly important if we want 2.x to go stable
sometime soon, so don't be afraid to ask if you're having trouble.
Those easily offended should take heed, as I included fortune's
offensive database. I just needed a lot of C-style strings pronto and
took the ones that were readily available.
If you're serious about helping to debug Nagios, you should run an
unstripped binary (file /usr/local/nagios/bin/nagios will tell you) so
that core dumps are useful, and set daemon_dumps_core to 1 in your
nagios.cfg. It's very important that you keep the core files and the
nagios .log and .sav files that were generated during the (possible)
crash, as debugging without them is simply hell.
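A quick way to check both prerequisites is sketched below. The config path /usr/local/nagios/etc/nagios.cfg is an assumption based on a default source install, and the NAGIOS_BIN / NAGIOS_CFG variables and both function names are made up for this example; adjust them to your setup.

```shell
#!/bin/sh
# Paths are assumptions based on a default source install; override as needed.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Succeeds if file(1) reports the binary as "not stripped".
binary_has_symbols() {
    file "$1" 2>/dev/null | grep -q 'not stripped'
}

# Succeeds if the config file contains the line daemon_dumps_core=1.
dumps_core_enabled() {
    grep -q '^daemon_dumps_core=1[[:space:]]*$' "$1" 2>/dev/null
}

if binary_has_symbols "$NAGIOS_BIN"; then
    echo "OK: $NAGIOS_BIN is not stripped"
else
    echo "WARNING: $NAGIOS_BIN is stripped, missing, or file(1) is unavailable"
fi

if dumps_core_enabled "$NAGIOS_CFG"; then
    echo "OK: daemon_dumps_core=1 is set in $NAGIOS_CFG"
else
    echo "WARNING: daemon_dumps_core=1 not found in $NAGIOS_CFG"
fi
```

Your shell's core size limit (ulimit -c) may also need raising before any core file gets written, depending on how Nagios is started on your system.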
If you don't like reporting things to the nagios-devel mailing list you
can send bug reports privately to me and I'll collect and forward them.
It's appreciated all the same, and results should be visible in the form
of commits to CVS and more stable code.
The check_rand plugin is written in ANSI C for portability but uses
/dev/urandom as its source for randomness. If this is a showstopper,
then let me know and I'll work around it.
Cheers, and thanks for listening and (possibly) contributing.
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Lead Developer
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users