Uptime Calculation Question

Breandan Dezendorf breandan at dezendorf.com
Thu Feb 10 22:37:12 CET 2011


Does anyone have a good guide to the impact the check_interval setting
has on calculating uptime and availability data from Nagios logs?

For example, if your check_interval is set to 10 minutes, a service
could be down for 9 minutes and never register in Nagios.  However,
your availability numbers at that point couldn't be any more precise
than 99.99% (as the cutoff for "five nines" is 5.26 service outage
minutes a year).  While unlikely, six such outages (54 minutes, just over
the 52.56-minute yearly budget for 99.99%) would push you into 99.9% -
and an SLA report generated from Nagios log files would still report
100%.  If the value for check_interval is set to 30
minutes, the problem is amplified - Nagios is more likely to miss
events, which makes me even less comfortable with the resulting
statistics.
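
To make the arithmetic above concrete, here is a rough sketch in Python.
It is not anything Nagios provides; the function names are made up for
illustration, and it assumes a 365-day year (525,600 minutes).

```python
# Back-of-the-envelope "nines" arithmetic. Assumption: a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_budget_minutes(availability_percent):
    """Outage minutes per year allowed at a given availability level."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def availability_percent(outage_minutes):
    """Yearly availability implied by a total number of outage minutes."""
    return 100 * (1 - outage_minutes / MINUTES_PER_YEAR)


# Yearly outage budgets for three, four and five nines.
for target in (99.9, 99.99, 99.999):
    budget = downtime_budget_minutes(target)
    print(f"{target}% allows {budget:.2f} outage minutes per year")

# Six 9-minute outages that all fall between checks and are never logged.
unseen_minutes = 6 * 9
actual = availability_percent(unseen_minutes)
print(f"{unseen_minutes} unseen minutes -> {actual:.4f}% actual availability")
```

The budgets come out to roughly 525.60, 52.56 and 5.26 minutes a year, so
54 unlogged minutes lands just below 99.99% while the log-based report
would still show 100%.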

Are there SLA packages for Nagios that account for this, or does
Nagios's in-built reporting engine account for this in some way?  Or,
is there a statistician amongst us who can make me understand that I'm
just being overly paranoid, and show me that the math actually works
out?
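
One way to frame the statistical side of the question is a toy model,
sketched below in Python.  It rests on assumptions that are mine and not
a description of how Nagios computes availability: checks are
instantaneous and evenly spaced, the outage starts at a random point
relative to the check schedule, there is no retry_interval or soft-state
handling, and each failed check is credited with one full check_interval
of downtime.  All names are hypothetical.

```python
import math
import random


def detection_probability(outage_minutes, check_interval_minutes):
    """Chance that at least one evenly spaced check lands inside the outage."""
    return min(1.0, outage_minutes / check_interval_minutes)


def mean_recorded_downtime(outage_minutes, check_interval_minutes,
                           trials=100_000, seed=0):
    """Average downtime the toy model would log for one outage.

    Assumption: every check that fails is counted as a full
    check_interval of downtime (the state is held until the next check).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Minutes from the start of the outage until the next scheduled check.
        offset = rng.uniform(0, check_interval_minutes)
        if offset < outage_minutes:
            # Number of checks that fall inside the outage window.
            remaining = outage_minutes - offset
            failed_checks = math.ceil(remaining / check_interval_minutes)
            total += failed_checks * check_interval_minutes
    return total / trials


p = detection_probability(9, 10)
mean_logged = mean_recorded_downtime(9, 10)
print(f"P(9-minute outage is seen with 10-minute checks) = {p:.2f}")
print(f"Mean logged downtime per outage = {mean_logged:.2f} minutes")
```

Under these assumptions a single outage is either missed entirely or
over-counted, so any one report can be badly off, but the average logged
downtime over many outages comes out close to the true outage length.
Whether that carries over to a real Nagios setup depends on the settings
the model ignores.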

-- 
Breandan Dezendorf
breandan at dezendorf.com
bwdezend at gmail.com





