check_snmp CPU Load strange result
Pascal Wessel
pascal.wessel at media-online.ch
Mon Dec 2 18:52:23 CET 2002
-----Original Message-----
From: Pascal Miquet [mailto:p.miquet at hafiba.fr]
Sent: Monday, 2 December 2002 18:29
To: Pascal Wessel
Subject: Re: [Nagios-users] check_snmp CPU Load strange result
Sorry, but I have no answer for you, just some questions about the
Cisco 3640 checks.
On which system do you run your Nagios service? Linux?
Yes, on Linux: Mandrake 9.0, kernel 2.4.19-16.
And if so, how did you set up the SNMP service checks? What additional
software is needed on your server?
All the ucd-snmp packages:
libsnmp0-4.2.3-4mdk
ucd-snmp-4.2.3-4mdk
ucd-snmp-utils-4.2.3-4mdk
And how can we get information about the SNMP services available
on a 3640 router?
By querying the MIB (by OID, or by name if you have the
Cisco MIBs on your Linux box in /usr/share/snmp/mibs/).
To test your SNMP installation (UCD-SNMP, not the Nagios scripts), just
run the following, logged in as user nagios:
snmpwalk myroutername myROcommunity system
where myroutername is the router's fully qualified DNS name or its
management IP address,
where myROcommunity is the SNMP read-only community (very often
left at the default string, public),
and where system is the start of the system MIB tree. To get the
full picture, leave out system and just run:
snmpwalk myroutername myROcommunity | more
Thanks for your help
You are welcome!
Regards
Pascal Miquet
On Mon, 02/12/2002 at 15:36, Pascal Wessel wrote:
Nagios reports WARNING when I use check_snmp to check the CPU load of a
Cisco 3640 (IOS C3640-IK9O3S-M, Version 12.2(10a)), even though the CPU
load is below my warning threshold.
When launched from the command-line with verbose output:
[libexec]# ./check_snmp -v -t 10 -H 192.168.1.1 \
    -o .1.3.6.1.4.1.9.2.1.57.0,.1.3.6.1.4.1.9.2.1.58.0 -C publicro \
    -w '60,69', -c '70,80' -l 'CPU usage 1min/5min' -D ' / '
/usr/bin/snmpget -m ALL -v 1 -c publicro 192.168.1.1:161 .1.3.6.1.4.1.9.2.1.57.0 .1.3.6.1.4.1.9.2.1.58.0
enterprises.9.2.1.57.0 = 4
enterprises.9.2.1.58.0 = 3
CPU usage 1min/5min WARNING - *4* / *3*
As you can see (and if I understood the syntax correctly):
Warning status should be triggered when the CPU load is between 60 and
69%.
Critical status should be triggered when the router CPU load is between
70 and 80%.
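To make the expectation concrete, here is a minimal Python sketch of the reading described above. It models only the poster's stated interpretation of the thresholds; it is not taken from check_snmp.c and may well differ from how the plugin actually evaluates them. The names WARNING_BAND, CRITICAL_BAND, expected_state and worst are hypothetical.

```python
# Sketch of the expectation described in the post; NOT check_snmp's real logic.
WARNING_BAND = (60, 69)   # percent, as stated in the post
CRITICAL_BAND = (70, 80)  # percent, as stated in the post

SEVERITY = ["OK", "WARNING", "CRITICAL"]


def expected_state(load_percent):
    """Return the state the poster expects for a single CPU load reading."""
    if CRITICAL_BAND[0] <= load_percent <= CRITICAL_BAND[1]:
        return "CRITICAL"
    if WARNING_BAND[0] <= load_percent <= WARNING_BAND[1]:
        return "WARNING"
    return "OK"


def worst(states):
    """Pick the most severe state out of several readings."""
    return max(states, key=SEVERITY.index)


# Values returned by snmpget in the verbose output above.
readings = {"1min": 4, "5min": 3}
overall = worst(expected_state(value) for value in readings.values())
print(overall)
```

Under this reading both values fall outside either band, so the expected overall state would be OK rather than the WARNING that the plugin printed.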
#----
My question is: why does this check report WARNING when my router's CPU
load (4% over the last minute and 3% over the last 5 minutes) is below
the WARNING threshold?
#----
My Nagios system installation is as follows:
System Intel i686, Mandrake 9.0, Kernel 2.4.19-16
NAGIOS: Nagios 1.0b6
Plugins: nagios-plugins-200211131100
Check_snmp: Revision: 1.17
SNMP:
libsnmp0-4.2.3-4mdk
ucd-snmp-4.2.3-4mdk
ucd-snmp-utils-4.2.3-4mdk
Below is a snippet of my .cfg files:
#--- hosts.cfg for myrouter
define host {
name generic-host
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
max_check_attempts 10
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
define host {
use generic-host ; Name of host template to use
host_name myrouter
alias Router Gva Coulou -6
address 192.168.1.1
check_command check-host-alive
notification_interval 60
notification_period 24x7
notification_options d,u,r
}
#--- services.cfg
define service {
name generic-service ;
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
normal_check_interval 5
retry_check_interval 2
notification_period 24x7
notification_options u,c,r
register 0 ; DONT REGISTER THIS DEFINITION
}
define service{
use generic-service
host_name myrouter
service_description CPU
is_volatile 0
check_period 24x7
max_check_attempts 3
retry_check_interval 1
contact_groups router-admins
notification_interval 120
notification_period 24x7
check_command check_cisco_cpu!publicro!60!69!70!80
}
#--- checkcommands.cfg
# 'check_snmp' generic command definition
define command{
command_name check_snmp
command_line $USER1$/check_snmp -t 10 -H $HOSTADDRESS$ -C $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$ $ARG7$ $ARG8$ $ARG9$
}
# check_cisco_cpu: checks router CPU-usage
# Syntax: !Hostname!Community!WARN-1min-%!WARN-5min-%!CRIT-1min-%!CRIT-5min-%
define command{
command_name check_cisco_cpu
command_line $USER1$/check_snmp -t 10 -H $HOSTADDRESS$
-o.1.3.6.1.4.1.9.2.1.57.0,.1.3.6.1.4.1.9.2.1.58.0 -C $ARG1$ -w
:$ARG2$,:$ARG3$ -c :
$ARG4$,:$ARG5$ -l 'CPU usage 1min/5min' -D ' / '
}
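As a cross-check, the following Python sketch expands the check_cisco_cpu definition with the arguments from the service's check_command, the way Nagios substitutes $ARGn$ macros from a "name!arg1!arg2!..." string. It assumes the line breaks in the posted command_line are mail wrapping, so that ":" and "$ARG4$" were adjacent in the real file. The $USER1$ value is a placeholder, since the post does not give the plugin directory, and the helper name expand is hypothetical.

```python
# Sketch: substitute Nagios macros into the posted check_cisco_cpu command_line.
# Assumption: the wrapped lines in the post are joined as below.
COMMAND_LINE = (
    "$USER1$/check_snmp -t 10 -H $HOSTADDRESS$ "
    "-o.1.3.6.1.4.1.9.2.1.57.0,.1.3.6.1.4.1.9.2.1.58.0 -C $ARG1$ "
    "-w :$ARG2$,:$ARG3$ -c :$ARG4$,:$ARG5$ "
    "-l 'CPU usage 1min/5min' -D ' / '"
)

# check_command value from the service definition in the post.
CHECK_COMMAND = "check_cisco_cpu!publicro!60!69!70!80"

MACROS = {
    "$USER1$": "/path/to/libexec",   # placeholder; not given in the post
    "$HOSTADDRESS$": "192.168.1.1",  # address from the host definition
}


def expand(check_command, command_line, macros):
    """Split 'name!arg1!arg2...' and substitute $ARGn$ and other macros."""
    name, *args = check_command.split("!")
    expanded = command_line
    for index, value in enumerate(args, start=1):
        expanded = expanded.replace(f"$ARG{index}$", value)
    for macro, value in macros.items():
        expanded = expanded.replace(macro, value)
    return name, expanded


name, expanded_line = expand(CHECK_COMMAND, COMMAND_LINE, MACROS)
print(name)
print(expanded_line)
```

Under these assumptions the expanded thresholds read -w :60,:69 -c :70,:80, whereas the manual test earlier in the post used '60,69' and '70,80' without the leading colons, so the two invocations may not be evaluated identically.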
By the way, looking at the code in check_snmp.c, I wonder:
is there a problem with #define mark(a) ((a)!=0?"*":"") in
check_snmp.c? Or are my parameters just wrong? :-o
Thanks for your kind help.
Warm regards,
Pascal