notification_interval seems to be ignored
Zembower, Kevin
kzembowe at jhuccp.org
Thu Nov 15 17:53:36 CET 2007
I've written a custom plug-in to monitor the ambient temperature probe
on my Dell PowerEdge server. It's a wrapper around the standard
check_snmp plugin. It's been working correctly for months now. I've pasted
the code at the end of this message.
Yesterday at 2:00 the compressor failed in my server room and the
temperature went up to 90F. My notification was correctly sent, and I
was happy. However, I've received a notification every 5 minutes since
then that the temperature is over 80.
In the service definition, I have notification_interval set to 0:
# check that ambient temperature from Dell sensor is less than 80 and 90 degrees F.
# Notify the 'temp' group only.
define service {
hostgroup_name temp_sensor
service_description Ambient Temperature
check_command check_ambtempF!80!90
#For testing, set the temperature too low
# check_command check_ambtempF!60!70
use generic-service
notification_interval 0 ; set > 0 if you want to be renotified
contact_groups temp
}
In generic-service.nagios2.cnf, the notification_interval is also set to
0. Furthermore, I don't have any escalations defined.
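A quick way to confirm that no other object definition sets a non-zero interval would be to list every notification_interval directive in the configuration files. The following is only a sketch: the helper name find_directive and the script name in the usage comment are hypothetical, and it assumes one directive per line with ';' starting a trailing comment, as in the definition above.

```perl
#!/usr/bin/perl
# Sketch: report every notification_interval directive in the files named
# on the command line. find_directive() is a hypothetical helper, not part
# of Nagios.
use strict;
use warnings;

# Return a list of [line_number, value] pairs for the given directive.
# A ';' and everything after it is treated as a trailing comment.
sub find_directive {
    my ($directive, @lines) = @_;
    my @hits;
    my $n = 0;
    for my $line (@lines) {
        $n++;
        next if $line =~ /^\s*#/;    # skip whole-line comments
        if ($line =~ /^\s*\Q$directive\E\s+([^;\s]+)/) {
            push @hits, [$n, $1];
        }
    }
    return @hits;
}

# Assumed usage: perl find_interval.pl /etc/nagios2/conf.d/*
for my $file (@ARGV) {
    my $fh;
    unless (open($fh, '<', $file)) {
        warn "cannot read $file: $!\n";
        next;
    }
    my @lines = <$fh>;
    close($fh);
    for my $hit (find_directive('notification_interval', @lines)) {
        print "$file:$hit->[0]: notification_interval = $hit->[1]\n";
    }
}
```

Any line reported with a value other than 0 would be worth a closer look, particularly in a template that the service inherits from.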
And, I just discovered that even though I disabled notifications for
this service using the nagios2 GUI, I'm still getting notified every
five minutes.
Can anyone suggest anything I can try to fix this behavior? Did I
overlook something in how I wrote the plugin?
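Nagios decides a service's state from the plugin's exit status, so one way to rule out the plugin is to run it by hand and decode that status. A minimal sketch, assuming the standard plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); run_plugin is a hypothetical helper, and the command line in the usage comment is only an example built from the options the plugin accepts.

```perl
#!/usr/bin/perl
# Sketch: run a plugin by hand and translate its exit status into the
# state name Nagios would assign. run_plugin() is a hypothetical helper.
use strict;
use warnings;

my @STATE = qw(OK WARNING CRITICAL UNKNOWN);

# Run a command in list form (no shell involved), let its output pass
# through to the terminal, and return (exit_code, state_name).
sub run_plugin {
    my @cmd = @_;
    system(@cmd);
    # -1 means the command could not be started; low 7 bits set means
    # it was killed by a signal. Both are reported as UNKNOWN here.
    return (-1, 'UNKNOWN') if $? == -1 || ($? & 127);
    my $code  = $? >> 8;
    my $state = $code <= $#STATE ? $STATE[$code] : 'UNKNOWN';
    return ($code, $state);
}

# Assumed usage:
#   perl run_plugin.pl /usr/lib/nagios/plugins/check_ambtempF -H 127.0.0.1 -w 80 -c 90
if (@ARGV) {
    my ($code, $state) = run_plugin(@ARGV);
    print "exit status $code => $state\n";
}
```

If the status and the first line of output agree with the thresholds, the repeated notifications are more likely a configuration or daemon issue than a plugin bug.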
My system is Nagios 2.6, as installed by the Debian 3.0 package system.
Thanks for any advice or suggestions.
-Kevin
Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland 21202
410-659-6139
========================================================
nagios at cn2://etc/nagios2/conf.d$ cat /usr/lib/nagios/plugins/check_ambtempF
#! /usr/bin/perl -w
# check_ambtempF is a perl wrapper around the Nagios check_snmp plugin
# to check the ambient temperature sensor in Dell PowerEdge servers.
# Written by Kevin Zembower, 13-Sep-2007
use strict;
use Getopt::Std;
my %opts;
getopts('dhc:w:H:',\%opts);
# Use this line below to dump the environment variables for debugging
system "env|sort >/tmp/plugins_env.$$" if defined($opts{d});
my %NAGIOS_ENV = map { $_ => $ENV{$_} } grep /^NAGIOS_/, keys %ENV;
if ( defined($opts{h}) ) {
print <<EOF;
Test ambient temperature in Fahrenheit plugin for Nagios
Copyright (c) 2007 Kevin Zembower
This plugin is used to check the ambient temperature in degrees
Fahrenheit on Dell PowerEdge servers using SNMP.
Requirements:
This plugin requires /usr/lib/nagios/plugins/check_snmp.
usage: $0 [-h] [-d] [-H hostaddress] [-w warn] [-c crit]
-h print this short help message
-H hostaddress address of host to check
-w warn Warning threshold value in degrees Fahrenheit
-c crit Critical threshold value in degrees Fahrenheit
-d Turn on debugging output
EOF
exit;
}
# for debugging
open(DMP, ">/tmp/temperature.dmp") if defined($opts{d});
my $warn= $opts{w};
my $crit= $opts{c};
my $debug = $opts{d};
my $hostaddress;
if (defined $opts{H}) {
$hostaddress=$opts{H};
} elsif (defined($NAGIOS_ENV{NAGIOS_HOSTADDRESS})) {
$hostaddress = $NAGIOS_ENV{NAGIOS_HOSTADDRESS};
} else {
$hostaddress="127.0.0.1"
};
print DMP "hostaddress is $hostaddress.\n" if defined($opts{d});
my $output = "Temperature ";
$_ = `/usr/lib/nagios/plugins/check_snmp -H $hostaddress -o .1.3.6.1.4.1.674.10892.1.700.20.1.6.1.3`;
print DMP $_ if defined($opts{d});
close(DMP) if defined($opts{d});
if ($? != 0) { #There was an error calling the check_snmp routine...
$output .= "UNKNOWN: CCP server room temperature could not be determined with host $hostaddress. Probable communications or host failure.\n";
print $output;
exit 3;
}
print "Error code: $?\n" if $debug;
print $_ if $debug;
(my $tempC) = /=(\d+)/; #All the digits after the equals sign are the temperature in tenths of a degree Celsius
$tempC /= 10; #Divide the returned value by 10
print "${tempC}C\n" if $debug;
my $tempF = (9/5*$tempC + 32);
print "${tempF}F\n" if $debug;
if ( defined $crit && $tempF >= $crit ) {
$output .= "CRITICAL: CCP server room temperature of ${tempF}F exceeds critical temperature of $crit\n";
$output .= "Probable air conditioning failure.\n";
print $output;
exit 2;
} elsif ( defined $warn && $tempF >= $warn ) {
$output .= "WARNING: CCP server room temperature of ${tempF}F exceeds warning temperature of $warn\n";
$output .= "Probable air conditioning failure.\n";
print $output;
exit 1;
} else {
$output .= "OK: CCP server room temperature is ${tempF}F\n";
print $output;
exit 0;
}
nagios at cn2://etc/nagios2/conf.d$
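To check the threshold logic without an SNMP round trip, the parsing, conversion and comparison steps of the plugin above can be pulled into small functions. This is a sketch only: the function names are hypothetical, the sample output line is made up and real check_snmp output may be formatted differently, and returning UNKNOWN when no reading can be parsed is a choice made here, not something the plugin above does.

```perl
#!/usr/bin/perl
# Sketch: the plugin's parse / convert / compare steps as testable functions.
use strict;
use warnings;

# Extract the reading (tenths of a degree Celsius) from a check_snmp
# output line; returns undef if no "=<digits>" is present.
sub parse_tenths_c {
    my ($line) = @_;
    return $line =~ /=(\d+)/ ? $1 : undef;
}

# Convert tenths of a degree Celsius to degrees Fahrenheit.
sub tenths_c_to_f {
    my ($tenths) = @_;
    return 9 / 5 * ($tenths / 10) + 32;
}

# Map a Fahrenheit reading to a Nagios exit code:
# 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN (no reading available).
sub classify {
    my ($temp_f, $warn, $crit) = @_;
    return 3 unless defined $temp_f;
    return 2 if defined $crit && $temp_f >= $crit;
    return 1 if defined $warn && $temp_f >= $warn;
    return 0;
}

# Hypothetical sample line, used only to exercise the functions.
my $sample = "SNMP OK - 300 | iso.3.6.1.4.1.674.10892.1.700.20.1.6.1.3=300";
my $tenths = parse_tenths_c($sample);
my $temp_f = defined $tenths ? tenths_c_to_f($tenths) : undef;
printf "%.1fF => exit code %d\n", $temp_f, classify($temp_f, 80, 90)
    if defined $temp_f;
```

Feeding in readings just below and just above each threshold shows whether the 80 and 90 degree boundaries behave as intended.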
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users