Nagios SLAs .... [SEC=UNCLASSIFIED]
Stanley.Hopcroft at Dest.gov.au
Fri Aug 17 07:36:28 CEST 2007
Dear Folks,
Is anyone interested in a toy like this (Nagios::SLA running here on the
CLI)?
foo$ host_down_report -t thismonth | perl -MNagios::SLA -lane '
    BEGIN { $x = Nagios::SLA->new(undef, undef, "24x7", undef, "Nagios") }
    next unless /\d{2}-\d{2}/;
    print $_, "\t", $x->sla_down("$F[1] $F[2]", "$F[3] $F[4]")'
Broken_Hill_pe        02-08-2007 01:12:33  02-08-2007 01:18:33  6m 0s        360
Wollongong            02-08-2007 15:02:03  02-08-2007 15:08:33  6m 30s       360
Orange                08-08-2007 13:21:05  08-08-2007 13:26:15  5m 10s       300
Sydney_pe             13-08-2007 07:31:51  13-08-2007 09:00:11  1h 28m 20s   5340
Darwin-backup_pe      14-08-2007 16:32:01  14-08-2007 17:45:11  1h 13m 10s   4380
Hobart_Harrington_St  15-08-2007 12:28:40  15-08-2007 16:00:10  3h 31m 30s   12720
foo$ host_down_report -t thismonth | perl -MNagios::SLA -lane '
    BEGIN { $x = Nagios::SLA->new(undef, undef, undef, undef, "Nagios") }
    next unless /\d{2}-\d{2}/;
    print $_, "\t", $x->sla_down("$F[1] $F[2]", "$F[3] $F[4]")'
Broken_Hill_pe        02-08-2007 01:12:33  02-08-2007 01:18:33  6m 0s        0
Wollongong            02-08-2007 15:02:03  02-08-2007 15:08:33  6m 30s       360
Orange                08-08-2007 13:21:05  08-08-2007 13:26:15  5m 10s       300
Sydney_pe             13-08-2007 07:31:51  13-08-2007 09:00:11  1h 28m 20s   3600
Darwin-backup_pe      14-08-2007 16:32:01  14-08-2007 17:45:11  1h 13m 10s   4380
Hobart_Harrington_St  15-08-2007 12:28:40  15-08-2007 16:00:10  3h 31m 30s   12720
host_down_report is a small public application (see Nagios::Report) that
outputs the host availability report on the CLI.
Nagios::SLA is an unpublished Perl module that computes the amount of
down time in an outage according to an SLA.
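In the first case, the SLA is called "24x7", so the SLA outage (last
column) is simply the length of the outage in seconds. The figures
appear to be counted in whole minutes: the 3 hours, 31 mins and 30
seconds from 12:28:40 to 16:00:10 is reported as 12,720 seconds (212
minutes) rather than 12,690.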
In the second case, no SLA is named, so the default SLA of Mon to Fri,
8 am to 6 pm, applies. The outage of 6 mins between 01:12:33 and
01:18:33 on Thur 2 Aug 2007 (Au has Euro style dates, i.e. DD-MM-YYYY)
falls outside those hours and so contributes 0 seconds of downtime to
the SLA.
The method sla_down() takes a pair of time stamps representing the
outage interval (i.e. DOWN, UP) and returns the number of seconds the
outage overlapped the SLA.
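
To make the idea of "seconds the outage overlapped the SLA" concrete, here is a minimal sketch in plain Perl. It is not the Nagios::SLA code; the names to_epoch() and sla_overlap() are made up for illustration, the Mon-Fri 8-18 window is hard-wired, and time stamps are treated as UTC to keep the arithmetic simple. It works at one-second resolution, so its results can differ a little from the figures in the report above.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Parse "DD-MM-YYYY HH:MM:SS" (Nagios, Euro date format) or
# "YYYY-MM-DD HH:MM:SS" (MySQL) into epoch seconds. The stamp is
# treated as UTC (an assumption) so no DST handling is needed.
sub to_epoch {
    my ($stamp) = @_;
    my ($d, $m, $y, $H, $M, $S);
    if ($stamp =~ /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})$/) {
        ($y, $m, $d, $H, $M, $S) = ($1, $2, $3, $4, $5, $6);
    }
    elsif ($stamp =~ /^(\d{2})-(\d{2})-(\d{4}) (\d{2}):(\d{2}):(\d{2})$/) {
        ($d, $m, $y, $H, $M, $S) = ($1, $2, $3, $4, $5, $6);
    }
    else {
        die "unrecognised time stamp: $stamp\n";
    }
    return timegm($S, $M, $H, $d, $m - 1, $y);
}

# Seconds of the outage [DOWN, UP) that fall inside Mon-Fri 08:00-18:00.
sub sla_overlap {
    my ($down, $up) = map { to_epoch($_) } @_;
    my $total = 0;
    my $day   = $down - ($down % 86400);      # midnight of the first day
    for (; $day < $up; $day += 86400) {
        my $wday = (gmtime($day))[6];         # 0 = Sunday ... 6 = Saturday
        next if $wday == 0 || $wday == 6;     # weekends are outside the SLA
        my $open  = $day + 8 * 3600;
        my $close = $day + 18 * 3600;
        my $from  = $down > $open  ? $down : $open;
        my $to    = $up   < $close ? $up   : $close;
        $total += $to - $from if $to > $from;
    }
    return $total;
}

# The Sydney_pe outage: only 08:00:00 to 09:00:11 is inside the SLA.
print sla_overlap("13-08-2007 07:31:51", "13-08-2007 09:00:11"), "\n";  # 3611
```

The per-day loop means an outage running over a weekend or several nights is handled without special cases, at the cost of one iteration per day of outage.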
Computation is done by bit maps encoding the SLAs (i.e. 3 bytes, or 24
bits, per day in a monthly SLA) and the hour part of the outage. The
sla_down() method supports MySQL and Nagios time stamps and possibly
others.
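
As an illustration of the bit map idea only (the module's real layout may well differ), the sketch below builds a monthly bit map with Perl's vec(), using one bit per hour so that each day takes 24 bits, or 3 bytes. The names sla_bitmap() and in_sla() are hypothetical, and the Mon-Fri 8-18 rule and UTC handling are assumptions.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Build a monthly SLA bit map: one bit per hour, 24 bits (3 bytes) per day.
# A set bit means that hour is covered by the SLA (here Mon-Fri 08:00-18:00).
sub sla_bitmap {
    my ($year, $month) = @_;                  # month is 1..12
    my $first = timegm(0, 0, 0, 1, $month - 1, $year);
    my $bits  = '';
    # Walk the month hour by hour until gmtime reports the next month.
    for (my $t = $first; (gmtime($t))[4] == $month - 1; $t += 3600) {
        my ($hour, $mday, $wday) = (gmtime($t))[2, 3, 6];
        my $covered = ($wday >= 1 && $wday <= 5
                       && $hour >= 8 && $hour < 18) ? 1 : 0;
        vec($bits, ($mday - 1) * 24 + $hour, 1) = $covered;
    }
    return $bits;
}

# Is the given hour (0..23) of the given day of the month inside the SLA?
sub in_sla {
    my ($bits, $mday, $hour) = @_;
    return vec($bits, ($mday - 1) * 24 + $hour, 1);
}

my $aug = sla_bitmap(2007, 8);
# unpack with a %32b* checksum template counts the set bits.
printf "%d bytes, %d SLA hours\n", length($aug), unpack('%32b*', $aug);
```

For August 2007 this prints "93 bytes, 230 SLA hours" (31 days at 3 bytes each; 23 weekdays at 10 hours each). Looking up an hour is then a single vec() call, which is what makes a bit map attractive when many outages are checked against the same month.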
sla_down() does not work with outages that span months (e.g. from 1
minute to midnight until, say, midday on the following workday, which
happens to be the first of the following month). The dumb workaround for
this would be to split the outage into as many pieces as the months it
spans and compute each piece separately. On the other hand, if you have
an outage like this perhaps the focus should be somewhere else.
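
The splitting workaround could look something like the sketch below. split_by_month() is a hypothetical helper, not part of Nagios::SLA; it takes the outage as epoch seconds (treated as UTC, an assumption) and returns pieces that each lie within one calendar month, so each piece could then be handed to a per-month computation.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Split the outage [down, up), given as epoch seconds, into pieces that
# each lie within a single calendar month. Returns a list of [start, end].
sub split_by_month {
    my ($down, $up) = @_;
    my @pieces;
    while ($down < $up) {
        my ($mon, $year) = (gmtime($down))[4, 5];    # year is years since 1900
        # Work out midnight on the first day of the following month.
        ($mon, $year) = $mon == 11 ? (0, $year + 1) : ($mon + 1, $year);
        my $next = timegm(0, 0, 0, 1, $mon, $year + 1900);
        my $end  = $next < $up ? $next : $up;
        push @pieces, [$down, $end];
        $down = $end;
    }
    return @pieces;
}

# One minute to midnight on Fri 31 Aug 2007 until midday on Mon 3 Sep 2007.
my $down = timegm(0, 59, 23, 31, 7, 2007);
my $up   = timegm(0, 0, 12, 3, 8, 2007);
for my $piece (split_by_month($down, $up)) {
    printf "%s -> %s\n", scalar gmtime($piece->[0]), scalar gmtime($piece->[1]);
}
```

This example yields two pieces: the last 60 seconds of August, and the stretch from midnight on 1 Sep to midday on 3 Sep. The SLA downtime for the whole outage is then the sum of the per-month results.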
sla_down() behaves this way because the SLA it constructs is a monthly
one, which meets my requirement for monthly reports. There are no plans
to change this, although if you report on smaller intervals, e.g.
weekly, the SLA computation should be OK. You simply construct the
standard monthly SLA and only compute on the part of the month that you
want to report on.
I am using it to report availability against an SLA of Mon-Fri 8 am to 6
pm (Mon-Fri 8-18). In our case, the outages are stored as rows in a
MySQL table. The report is constructed by iterating over the rows and
computing the SLA outage for that row.
I would say this is not a particularly good module as far as Perl modules
go (where an alpha module would be WWW::Mechanize or LWP), but it may be
helpful, or at least annoy an alpha developer enough to do it properly.
If there is any interest, you will find it in the usual locations.
However, given the problems with ePN and version 3.x, there may be no
time for responding to bugs.
Yours sincerely.
Classification: UNCLASSIFIED