check_openmanage weirdness
Greg Etling
getling at stern.nyu.edu
Wed May 19 21:50:17 CEST 2010
I have just started implementing some check_openmanage checks on my
servers, and have run into some odd behavior with the combination of
Windows 2003, OM 6.2 and the SNMP check. It appears that this
combination is having issues with the drive/controller reporting.
Initially things worked fine under OM 5.4, until the SNMP service would
die (other than that, Mrs. Lincoln...), so I upgraded to OM 6.2, at which
point I observed the following behavior.
When the check is run without any blacklisting, the plugin reports a
global status of WARNING even though all components are OK. The WARNING
comes from the out-of-date firmware/driver versions listed below:
------
Firmware/Driver Information for Controller PERC 6/i Integrated
Firmware Version 6.0.3-0002
Minimum Required Firmware Version 6.2.0-0012
Driver Version 2.14.00.32
Minimum Required Driver Version 2.23.00.32
Storport Driver Version 5.2.3790.3959
Minimum Required Storport Driver Version 5.2.3790.4173
------
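To make the gap between installed and minimum required versions explicit, here is a minimal Python sketch that compares the pairs listed above. It assumes the version strings can be compared component by component as integers; that is an assumption for illustration, not necessarily how OMSA itself decides. The names parse_version, REPORTED and outdated are hypothetical helpers, not part of check_openmanage.

```python
import re


def parse_version(text):
    """Split a version such as '6.0.3-0002' into a tuple of integers."""
    return tuple(int(part) for part in re.split(r"[.-]", text))


# (installed, minimum required) pairs copied from the report above.
REPORTED = {
    "Firmware": ("6.0.3-0002", "6.2.0-0012"),
    "Driver": ("2.14.00.32", "2.23.00.32"),
    "Storport Driver": ("5.2.3790.3959", "5.2.3790.4173"),
}


def outdated(reported):
    """Return the names whose installed version is below the minimum."""
    return [
        name
        for name, (installed, minimum) in reported.items()
        if parse_version(installed) < parse_version(minimum)
    ]


if __name__ == "__main__":
    print(outdated(REPORTED))
```

Under that assumption, all three entries in the report are below their minimum.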
When I ran the plugin in debug mode, I noticed that it had no information
about the drives at all (note the beta version; plugin v3.5.7 gives the
same output):
------
[root at sys-mgt-1 stern]# ./check_openmanage -H testserver -C *****
System: PowerEdge 2950
ServiceTag: XXXXXXX OMSA version: 6.2.0
BIOS/date: 2.3.1 04/29/2008 Plugin version: 3.5.8-beta7
-----------------------------------------------------------------------------
Chassis Components
=============================================================================
STATE | ID | MESSAGE TEXT
---------+------+------------------------------------------------------------
OK | 1 | Memory module 1 [DIMM1, 2048 MB] is Ok
OK | 2 | Memory module 2 [DIMM2, 2048 MB] is Ok
OK | 3 | Memory module 3 [DIMM3, 2048 MB] is Ok
OK | 4 | Memory module 4 [DIMM4, 2048 MB] is Ok
OK | 1 | Chassis fan 1 [System Board FAN 1 RPM]: 7050
OK | 2 | Chassis fan 2 [System Board FAN 2 RPM]: 7125
OK | 3 | Chassis fan 3 [System Board FAN 3 RPM]: 7125
OK | 4 | Chassis fan 4 [System Board FAN 4 RPM]: 7050
OK | 0 | Power Supply 0 [AC]: Presence detected
OK | 1 | Power Supply 1 [AC]: Presence detected
OK | 0 | Temperature Probe 0 [System Board Ambient Temp] reads 22 C (min=8/3, max=42/47)
OK | 0 | Processor 0 [Intel Xeon E5440 2.83GHz] is Present
OK | 1 | Processor 1 [Intel Xeon E5440 2.83GHz] is Present
OK | 0 | Voltage sensor 0 [CPU1 VCORE] is Good
OK | 1 | Voltage sensor 1 [CPU2 VCORE] is Good
OK | 2 | Voltage sensor 2 [System Board CPU VTT] is Good
OK | 3 | Voltage sensor 3 [System Board 1.5V PG] is Good
OK | 4 | Voltage sensor 4 [System Board 1.8V PG] is Good
OK | 5 | Voltage sensor 5 [System Board 3.3V PG] is Good
OK | 6 | Voltage sensor 6 [System Board 5V PG] is Good
OK | 7 | Voltage sensor 7 [Riser 1.5V PXH PG] is Good
OK | 8 | Voltage sensor 8 [Riser 5V Riser PG] is Good
OK | 9 | Voltage sensor 9 [System Board Backplane PG] is Good
OK | 10 | Voltage sensor 10 [System Board Linear PG] is Good
OK | 11 | Voltage sensor 11 [System Board 0.9V PG] is Good
OK | 12 | Voltage sensor 12 [System Board 0.9V Over Volt] is Good
OK | 13 | Voltage sensor 13 [System Board CPU Power Fault] is Good
OK | 0 | Battery probe 0 [System Board CMOS Battery] is Presence Detected
OK | 0 | Chassis intrusion 0 detection: Ok (Not Breached)
-----------------------------------------------------------------------------
Other messages
=============================================================================
STATE | MESSAGE TEXT
---------+-------------------------------------------------------------------
OK | ESM log health is Ok (less than 80% full)
OOPS! Something is wrong with this server, but I don't know what. The
global system health status is WARNING, but every component check is OK.
This may be a bug in the Nagios plugin, please file a bug report.
------
The status reported to Nagios also claims that there are no disks
whatsoever on the server:
------
OK - System: 'PowerEdge 2950', SN: 'XXXXXXX', hardware working fine, 0 logical drives, 0 physical drives
------
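A quick way to spot other hosts showing the same symptom is to scan the status text for an OK result with zero physical drives. The following is a rough Python sketch, assuming the status line has the same wording as the one quoted above; STATUS_RE, drive_counts and looks_suspicious are hypothetical names for illustration and not part of the plugin.

```python
import re

# Matches the state at the start of the line and the two drive counts.
# \s+ and DOTALL let the pattern survive a status line wrapped by email.
STATUS_RE = re.compile(
    r"^(?P<state>OK|WARNING|CRITICAL|UNKNOWN)\b.*?"
    r"(?P<logical>\d+)\s+logical drives,\s+(?P<physical>\d+)\s+physical drives",
    re.DOTALL,
)


def drive_counts(status_line):
    """Return (state, logical, physical), or None if the line does not match."""
    match = STATUS_RE.search(status_line)
    if match is None:
        return None
    return (
        match.group("state"),
        int(match.group("logical")),
        int(match.group("physical")),
    )


def looks_suspicious(status_line):
    """True when the plugin says OK but reports no physical drives."""
    parsed = drive_counts(status_line)
    return parsed is not None and parsed[0] == "OK" and parsed[2] == 0


if __name__ == "__main__":
    sample = (
        "OK - System: 'PowerEdge 2950', SN: 'XXXXXXX', hardware working fine, "
        "0 logical drives, 0 physical drives"
    )
    print(drive_counts(sample), looks_suspicious(sample))
```

A server with no local disks would also match, so a hit is only a hint that the storage data may be missing from the SNMP response.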
This has been replicated on several identical systems.
I'm a bit stumped as to where the problem lies. Please let me know if
you need further information from me.
Thanks in advance,
Greg
---
Greg Etling
getling at stern.nyu.edu
Systems Administrator
Stern IT Enterprise Operations
NYU Stern School of Business