Nagios MTBF MTTR
Martin Melin
mmelin at gmail.com
Wed May 19 20:44:01 CEST 2010
On Wed, May 19, 2010 at 7:41 PM, Christian Iñiguez
<challenger_joseph at yahoo.com.mx> wrote:
> Hi Everyone!
>
> I am currently using Nagios 3.2.0, and it has been very useful for us. Recently I was told that we need to implement the MTBF (Mean Time Between Failures) and MTTR (Mean Time To Repair) measures, but I do not know how.
>
> Is there any tool, script, or Nagios (or Nagios-based) report that produces these measures? Has anybody implemented this in Nagios?
>
> Any help would be very useful to me. I hope you guys can help me.
>
> Thanks in advance!!
I don't know of a released script that does this, but all the data is
available from core Nagios. MTBF is just (total time - total downtime)
/ number of failures, i.e. total uptime divided by the number of
failures. As long as your only definition of a fault being repaired is
"Nagios records a recovery", MTTR is simply the average time between a
hard DOWN state change and the following UP state change.
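
To make the arithmetic concrete, here is a rough sketch in Python. It
is not a released tool; the function name mtbf_mttr and the input shape
(a sorted list of hard state changes plus a reporting window) are my
own assumptions for illustration. It takes MTBF as time spent up
divided by the number of failures, and assumes the host is UP at the
start of the window.

```python
def mtbf_mttr(events, window_start, window_end):
    """Compute (MTBF, MTTR) in seconds for one host.

    events: chronologically sorted (unix_timestamp, state) pairs of HARD
    state changes, where state is "UP" or "DOWN" (hypothetical input shape).
    Assumes the host is UP at window_start.
    """
    downtime = 0
    failures = 0
    repair_times = []
    down_since = None

    for ts, state in events:
        if state == "DOWN" and down_since is None:
            # A new failure begins.
            down_since = ts
            failures += 1
        elif state == "UP" and down_since is not None:
            # Recovery: close the outage and record how long it lasted.
            repair_times.append(ts - down_since)
            downtime += ts - down_since
            down_since = None

    if down_since is not None:
        # Outage still open at the end of the window: it counts as
        # downtime, but not as a completed repair.
        downtime += window_end - down_since

    uptime = (window_end - window_start) - downtime
    mtbf = uptime / failures if failures else None
    mttr = sum(repair_times) / len(repair_times) if repair_times else None
    return mtbf, mttr
```

With two outages of 600 and 200 seconds in a 10000 second window, this
gives an MTBF of 4600 seconds and an MTTR of 400 seconds.
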
For MTBF, the avail.cgi output combined with a count of recoveries
should IMHO get you most of the way there. MTTR requires some more
logic, since each hard DOWN has to be paired with the recovery that
follows it, but should be relatively simple.
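
As a sketch of that extra logic, something like the following could
pair hard DOWN and UP entries per host from the Nagios log. It assumes
host alert lines of the form
"[epoch] HOST ALERT: host;STATE;STATE_TYPE;attempt;output"; check that
against your own nagios.log before relying on it. The names HOST_ALERT,
repair_times and mttr are made up for this example.

```python
import re
from collections import defaultdict

# Assumed line shape: [epoch] HOST ALERT: host;STATE;STATE_TYPE;attempt;output
HOST_ALERT = re.compile(
    r"^\[(\d+)\] HOST ALERT: ([^;]+);(UP|DOWN|UNREACHABLE);(HARD|SOFT);"
)

def repair_times(lines):
    """Return {host: [seconds from a hard DOWN to the next hard UP, ...]}."""
    down_since = {}
    repairs = defaultdict(list)
    for line in lines:
        m = HOST_ALERT.match(line)
        if not m or m.group(4) != "HARD":
            continue  # skip soft states and anything that is not a host alert
        ts, host, state = int(m.group(1)), m.group(2), m.group(3)
        if state == "DOWN":
            # Keep the first DOWN timestamp if the outage is already open.
            down_since.setdefault(host, ts)
        elif state == "UP" and host in down_since:
            repairs[host].append(ts - down_since.pop(host))
    return dict(repairs)

def mttr(lines):
    """Average repair time in seconds per host, for hosts with a recovery."""
    return {h: sum(r) / len(r) for h, r in repair_times(lines).items()}
```

Usage would be along the lines of mttr(open("nagios.log")), with the
path adjusted to wherever your log and its archives live. UNREACHABLE
entries are ignored here, which may or may not match how you want to
count failures.
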
If there's sufficient interest I could probably try hacking up a draft
of this, but I'd like to see if anyone else on the list has a better
idea or working code first.

Best regards,
Martin Melin