improving the 300 second resolution nagiosgraph

Marc Powell marc at ena.com
Tue Jan 5 16:57:19 CET 2010


On Jan 5, 2010, at 9:09 AM, Litwin, Matthew wrote:

> Thank you very much. This correlates with the results I am seeing. Normally this behavior is beneficial for any sort of floating point arithmetic, but it can easily produce confusing results for monitors that (1) report integer values and (2) are checked more frequently than the step or "bucket" duration.

Integer values are a different situation as well ;) In RRDTool, *all* values are treated as rates of change, even GAUGEs. RRDTool also pre-creates all the buckets that the values will fall into at very specific timestamps (multiples of the step, counted from the RRD start time, or from the epoch if no start is specified). If you insert a value at the exact time that corresponds to a bucket, RRDTool uses that value as is, and it looks as if your value was treated as an integer. If you insert at a time slightly before or slightly after a bucket boundary, RRDTool adjusts your value based on the rate of change since the last insert and the exact time of the current insert. It essentially 'fudges' the value to make it fit in the correct bucket, as if it were a rate.
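
To make the 'fudging' concrete, here is a rough Python sketch of that kind of time-weighted normalization. It is a simplified model, not RRDTool's actual code: it assumes each inserted GAUGE value applies to the whole interval since the previous insert, and it ignores heartbeat and unknown-data handling. The function name and the sample timestamps are made up for illustration.

```python
def primary_data_points(updates, step):
    """Simplified model of bucket normalization (assumption, not RRDTool source).

    updates: list of (timestamp, value) pairs in ascending time order.
             The first pair only marks the starting time.
    Returns {bucket_end_time: time-weighted average} for fully covered buckets.
    """
    weighted = {}   # bucket end -> sum of value * seconds
    covered = {}    # bucket end -> seconds of the bucket seen so far
    for (t0, _), (t1, value) in zip(updates, updates[1:]):
        t = t0
        while t < t1:
            # End of the bucket that contains the instant just after t.
            bucket_end = (t // step + 1) * step
            seg_end = min(bucket_end, t1)
            weighted[bucket_end] = weighted.get(bucket_end, 0.0) + value * (seg_end - t)
            covered[bucket_end] = covered.get(bucket_end, 0) + (seg_end - t)
            t = seg_end
    return {end: weighted[end] / step
            for end in sorted(weighted) if covered[end] == step}


# Inserts exactly on the 300 second boundaries: values come back unchanged.
aligned = primary_data_points([(0, 0), (300, 1), (600, 0)], step=300)

# Same readings inserted 10 seconds late: part of the "1" leaks into the
# next bucket, so that bucket is no longer an integer.
late = primary_data_points([(0, 0), (310, 1), (610, 0)], step=300)
```

In this model the aligned inserts give 1.0 and 0.0 for the two buckets, while the late inserts give 1.0 for the first bucket and 10/300 for the second.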

> From your explanation, this is what I can surmise about my current case. A monitor is checked every 60 seconds, so there are going to be 5 readings per step that would be averaged. This monitor is usually flat at "0" and occasionally blips to "1". Thus, depending on how many readings per step are at "1" and how many are at "0", my RRD data at each step will predictably be 0.2 for 1 reading at "1", 0.4 for 2 readings at "1", and so forth. Do I have that right?

Mostly. There's some additional fudging that goes on when you try to insert multiple values between steps, with some values being ignored, I believe. I haven't cared enough about it to dig into the specifics personally.
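
The idealized arithmetic from the question looks like this in Python. It assumes all five 60 second readings land cleanly inside one 300 second step and are weighted equally, which, per the caveat above, is not guaranteed in practice. The function name is made up for illustration.

```python
def step_average(readings):
    """Plain average of the readings that fall inside one step (idealized)."""
    return sum(readings) / len(readings)


READINGS_PER_STEP = 5  # 300 second step / 60 second check interval

# Stored value for 0..5 readings at "1", the rest at "0".
averages = [
    step_average([1] * ones + [0] * (READINGS_PER_STEP - ones))
    for ones in range(READINGS_PER_STEP + 1)
]

for ones, avg in enumerate(averages):
    print(f"{ones} reading(s) at 1 -> {avg}")
```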

> That said, it sounds like I might want to make a plea to the author of nagiosgraph to make the step length configurable, but I might just be hitting the wall of what this tool can actually do.

Yes, they would need to add that support. It will introduce some end-user complexity in cases where checks happen at different intervals, but playing around with the heartbeat so that it covers the longest interval would probably work.
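
As a rough sketch of what "playing around with heartbeat" could mean, the Python below builds an `rrdtool create` command line from a list of check intervals. The rule used here (step = shortest interval, heartbeat = twice the longest) is only an assumption for illustration, not something nagiosgraph does; the function name, file name, data source name and row count are hypothetical too.

```python
def rrd_create_args(path, ds_name, check_intervals, rows=1440):
    """Build an `rrdtool create` argument list (illustrative rule of thumb only)."""
    step = min(check_intervals)            # assumed: finest check interval
    heartbeat = 2 * max(check_intervals)   # assumed: tolerate one missed slow check
    return [
        "rrdtool", "create", path,
        "--step", str(step),
        f"DS:{ds_name}:GAUGE:{heartbeat}:U:U",   # no min/max limits
        f"RRA:AVERAGE:0.5:1:{rows}",             # one row per step
    ]


# Checks at 60 and 300 second intervals sharing one configuration.
args = rrd_create_args("example.rrd", "value", [60, 300])
print(" ".join(args))
```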

--
Marc