check_openmanage weirdness

Greg Etling getling at stern.nyu.edu
Wed May 19 21:50:17 CEST 2010


I have just started implementing some check_openmanage checks on my 
servers, and have run into some odd behavior with the combination of 
Windows 2003, OM 6.2 and the SNMP check. This combination appears to 
have trouble reporting drives and controllers. Initially things worked 
fine under OM 5.4, except that the SNMP service would die (other than 
that, Mrs. Lincoln...), so I upgraded to OM 6.2, at which point I 
observed the following behavior.

When the check is run without any blacklisting, the plugin reports a 
global status of WARNING even though all components are OK. The WARNING 
comes from the out-of-date firmware/driver versions listed below:

------
Firmware/Driver Information for Controller PERC 6/i Integrated
Firmware Version    6.0.3-0002
Minimum Required Firmware Version    6.2.0-0012
Driver Version    2.14.00.32
Minimum Required Driver Version    2.23.00.32
Storport Driver Version    5.2.3790.3959
Minimum Required Storport Driver Version    5.2.3790.4173
------

When I ran the plugin in debug mode, I noticed that it had no 
information about the drives at all (note the beta version; plugin 
v3.5.7 gives the same output):
------
[root@sys-mgt-1 stern]# ./check_openmanage -H testserver -C *****
    System:      PowerEdge 2950
    ServiceTag:  XXXXXXX                  OMSA version:    6.2.0
    BIOS/date:   2.3.1 04/29/2008         Plugin version:  3.5.8-beta7
-----------------------------------------------------------------------------
    Chassis Components
=============================================================================
   STATE  |  ID  |  MESSAGE TEXT
---------+------+------------------------------------------------------------
       OK |    1 | Memory module 1 [DIMM1, 2048 MB] is Ok
       OK |    2 | Memory module 2 [DIMM2, 2048 MB] is Ok
       OK |    3 | Memory module 3 [DIMM3, 2048 MB] is Ok
       OK |    4 | Memory module 4 [DIMM4, 2048 MB] is Ok
       OK |    1 | Chassis fan 1 [System Board FAN 1 RPM]: 7050
       OK |    2 | Chassis fan 2 [System Board FAN 2 RPM]: 7125
       OK |    3 | Chassis fan 3 [System Board FAN 3 RPM]: 7125
       OK |    4 | Chassis fan 4 [System Board FAN 4 RPM]: 7050
       OK |    0 | Power Supply 0 [AC]: Presence detected
       OK |    1 | Power Supply 1 [AC]: Presence detected
       OK |    0 | Temperature Probe 0 [System Board Ambient Temp] reads 22 C (min=8/3, max=42/47)
       OK |    0 | Processor 0 [Intel Xeon E5440 2.83GHz] is Present
       OK |    1 | Processor 1 [Intel Xeon E5440 2.83GHz] is Present
       OK |    0 | Voltage sensor 0 [CPU1 VCORE] is Good
       OK |    1 | Voltage sensor 1 [CPU2 VCORE] is Good
       OK |    2 | Voltage sensor 2 [System Board CPU VTT] is Good
       OK |    3 | Voltage sensor 3 [System Board 1.5V PG] is Good
       OK |    4 | Voltage sensor 4 [System Board 1.8V PG] is Good
       OK |    5 | Voltage sensor 5 [System Board 3.3V PG] is Good
       OK |    6 | Voltage sensor 6 [System Board 5V PG] is Good
       OK |    7 | Voltage sensor 7 [Riser 1.5V PXH PG] is Good
       OK |    8 | Voltage sensor 8 [Riser 5V Riser PG] is Good
       OK |    9 | Voltage sensor 9 [System Board Backplane PG] is Good
       OK |   10 | Voltage sensor 10 [System Board Linear PG] is Good
       OK |   11 | Voltage sensor 11 [System Board 0.9V PG] is Good
       OK |   12 | Voltage sensor 12 [System Board 0.9V Over Volt] is Good
       OK |   13 | Voltage sensor 13 [System Board CPU Power Fault] is Good
       OK |    0 | Battery probe 0 [System Board CMOS Battery] is Presence Detected
       OK |    0 | Chassis intrusion 0 detection: Ok (Not Breached)
-----------------------------------------------------------------------------
    Other messages
=============================================================================
   STATE  |  MESSAGE TEXT
---------+-------------------------------------------------------------------
       OK | ESM log health is Ok (less than 80% full)
OOPS! Something is wrong with this server, but I don't know what. The 
global system health status is WARNING, but every component check is OK. 
This may be a bug in the Nagios plugin, please file a bug report.
------
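
As a rough illustration of the situation that message describes (a global 
status worse than every individual component status), here is a minimal 
Python sketch. It is not the plugin's actual code; the names SEVERITY, 
worst_state and is_inconsistent are hypothetical, and the severity 
ordering is an assumption made for the example.

```python
# Illustrative sketch only, not taken from check_openmanage.
# Assumed severity ordering for this example, lowest to highest.
SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def worst_state(component_states):
    """Return the most severe state among the component checks.

    An empty list is treated as "OK" (an assumption for this sketch).
    """
    return max(component_states, key=lambda s: SEVERITY[s], default="OK")


def is_inconsistent(global_state, component_states):
    """True when the global status is worse than every component status."""
    return SEVERITY[global_state] > SEVERITY[worst_state(component_states)]


if __name__ == "__main__":
    # Every component row in the debug output above is OK,
    # while the global status is WARNING.
    components = ["OK"] * 10
    if is_inconsistent("WARNING", components):
        print("global status is WARNING but every component check is OK")
```

With these assumptions, a WARNING global status alongside all-OK 
components is flagged, while a WARNING global status alongside at least 
one WARNING component is not.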

And the status reported to Nagios suggests that there are no disks 
whatsoever on the server:
------
OK - System: 'PowerEdge 2950', SN: 'XXXXXXX', hardware working fine, 0 logical drives, 0 physical drives
------
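
A status line like the one above can be screened for the "no drives at 
all" case with a small script. This is a minimal sketch in Python, 
assuming the status text keeps the "N logical drives, N physical drives" 
wording shown here; the function name drive_counts and the regular 
expression are hypothetical and not part of the plugin.

```python
import re

# Assumes the wording seen in the status line above; other plugin
# versions may phrase this differently.
DRIVE_COUNTS = re.compile(r"(\d+) logical drives?, (\d+) physical drives?")


def drive_counts(status_line):
    """Return (logical, physical) drive counts, or None if not present."""
    match = DRIVE_COUNTS.search(status_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    line = (
        "OK - System: 'PowerEdge 2950', SN: 'XXXXXXX', hardware working "
        "fine, 0 logical drives, 0 physical drives"
    )
    if drive_counts(line) == (0, 0):
        print("status line reports no logical or physical drives")
```

Running this across the affected hosts would show quickly which servers 
report zero drives.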

This has been replicated on several identical systems.

I'm a bit stumped as to where the problem lies. Please let me know if 
you need further information from me.

Thanks in advance,
Greg
---
Greg Etling
getling at stern.nyu.edu
Systems Administrator
Stern IT Enterprise Operations
NYU Stern School of Business

