Windows disk health monitoring with smartmontools/NSClient++?
Eric Pearce
epearce at amberpoint.com
Thu Jan 22 23:26:59 CET 2009
I've hacked something together that seems to work using WSH and WMI (no smartmontools).
It displays the following for the "Service State Information":
Current Status: OK (for 0d 0h 14m 52s)
Status Information:SMART Status is OK
Performance Data:WDC WD1500HLFS-01G6U0 139 GB
In "nsc.ini" on the client, I've made the following changes:
uncommented NRPEListener.dll
added to [External Scripts]
check_smart_disk0=cscript.exe //T:30 //NoLogo "C:\Program Files\NSClient++\scripts\smart.vbs" 0
check_smart_disk1=cscript.exe //T:30 //NoLogo "C:\Program Files\NSClient++\scripts\smart.vbs" 1
The file "smart.vbs" contains:
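' smart.vbs: report the WMI status of one physical disk to Nagios.
' Usage: cscript.exe //NoLogo smart.vbs <disk index>
Set args = WScript.Arguments
If args.Count < 1 Then
    WScript.Echo "Usage: smart.vbs <disk index>"
    WScript.Quit(3)   ' Nagios UNKNOWN
End If
drive = CInt(args(0))
strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
    & "{impersonationLevel=impersonate}!\\" & strComputer _
    & "\root\cimv2")
Set diskset = objWMIService.ExecQuery _
    ("Select * from Win32_DiskDrive")
For Each disk In diskset
    If disk.Index = drive Then
        Select Case disk.Status
            Case "OK"
                ' Text after the "|" is shown by Nagios as performance data.
                WScript.Echo "SMART Status is OK| " & disk.Model & " " & Int(disk.Size/1073741824) & " GB"
                WScript.Quit(0)   ' Nagios OK
            Case Else
                WScript.Echo "SMART Status is " & disk.Status
                WScript.Quit(1)   ' Nagios WARNING
        End Select
    End If
Next
' No disk with the requested index was found.
WScript.Echo "Disk " & drive & " not found"
WScript.Quit(3)   ' Nagios UNKNOWN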
There are actually a number of different error states, but I figure I want to know about anything other than "OK". I kept the level at "WARNING" because I don't know how useful it will be until I get more experience with real-life disk error messages. I'm aware that disks sometimes die with no warning from SMART. I've never used Visual Basic before, so feel free to improve on this; I just cobbled together snippets of code I found via Google.
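For anyone who wants to experiment with the status-to-level policy away from a Windows box, here is a minimal Python sketch of the same decision logic. It is not part of smart.vbs. The names `classify`, `plugin_line` and `CRITICAL_STATUSES` are made up for illustration, and the optional escalation of some states to CRITICAL is an assumption, not something the script does.

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Hypothetical escalation table: statuses listed here may be treated as
# CRITICAL. This choice is an assumption, not what smart.vbs does.
CRITICAL_STATUSES = {"Pred Fail", "Error", "NonRecover"}


def classify(status, escalate=False):
    """Map a Win32_DiskDrive Status string to a Nagios exit code."""
    if status == "OK":
        return OK
    if escalate and status in CRITICAL_STATUSES:
        return CRITICAL
    # Default policy from the post: anything other than "OK" is a WARNING.
    return WARNING


def plugin_line(status, model, size_bytes):
    """Build the one-line plugin output in the same shape as smart.vbs."""
    if status == "OK":
        size_gb = size_bytes // 1073741824  # bytes to GiB, rounded down
        return f"SMART Status is OK| {model} {size_gb} GB"
    return f"SMART Status is {status}"
```

With `escalate` left at its default, every non-"OK" state stays a WARNING, which matches the behaviour described above. Turning it on is one possible way to promote the more alarming states once there is some real-world experience with them.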
On the Nagios server side, the service and hostgroup definitions look like the following:
define service{
        use                     generic-service
        hostgroup_name          check_smart_disk0
        service_description     SMART Disk 0
        check_command           check_nrpe!check_smart_disk0
        check_interval          720
        }
define service{
        use                     generic-service
        hostgroup_name          check_smart_disk1
        service_description     SMART Disk 1
        check_command           check_nrpe!check_smart_disk1
        check_interval          720
        }
define hostgroup{
        hostgroup_name          check_smart_disk0
        alias                   Windows SMART Disk0 status
        members                 host1, host2, host3
        }
define hostgroup{
        hostgroup_name          check_smart_disk1
        alias                   Windows SMART Disk1 status
        members                 host2
        }
I do have to know ahead of time the number of disks to check on the client. Seems to be working so far.
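One possible way around knowing the disk count in advance would be a single check that looks at every disk and reports the combined result. The Python sketch below shows only that aggregation step. The function name `summarize` and its input format (a list of index/status pairs, such as the Index and Status properties of each Win32_DiskDrive instance) are assumptions for illustration; I have not tried this against NSClient++.

```python
def summarize(disks):
    """Combine per-disk statuses into one Nagios (exit code, message) pair.

    `disks` is a list of (index, status) pairs. Hypothetical input format:
    one pair per Win32_DiskDrive instance.
    """
    if not disks:
        return 3, "No disks found"  # UNKNOWN: nothing to report on
    # Collect every disk whose status is anything other than "OK".
    bad = [(i, s) for i, s in sorted(disks) if s != "OK"]
    if not bad:
        return 0, f"SMART Status is OK on {len(disks)} disk(s)"
    detail = ", ".join(f"disk {i}: {s}" for i, s in bad)
    return 1, f"SMART Status is not OK ({detail})"  # WARNING, as above
```

The trade-off is that one service then covers all disks on a host, so the per-disk hostgroups would no longer be needed, but a single alert has to be read to see which disk is affected.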
-e
----- Original Message -----
From: Eric Pearce
To: Anthony Montibello
Cc: nagios-users at lists.sourceforge.net
Sent: Thursday, January 15, 2009 3:14 PM
Subject: Re: [Nagios-users] Windows disk health monitoring with smartmontoolsl/NSClient++?
Thanks for the tip - I think I'm making some progress, i.e.
C:\Program Files\NSClient++>"nsclient++.exe" CheckWMI Select Status from Win32_DiskDrive
\NSClient++.cpp(370) Attempting to start NSCLient++ - 0.3.5.2 2008-09-24
l \NSClient++.cpp(476) NSCLient++ - 0.3.5.2 2008-09-24 Started!
l \CheckWMI.cpp(306) |--------+
l \CheckWMI.cpp(307) | Status |
l \CheckWMI.cpp(308) |--------+
l \CheckWMI.cpp(317) | OK |
l \CheckWMI.cpp(319) |--------+
l \NSClient++.cpp(530) Attempting to stop NSCLient++ - 0.3.5.2 2008-09-24
l \NSClient++.cpp(589) NSCLient++ - 0.3.5.2 2008-09-24 Stopped succcessfully
But I don't see how to turn this output into something useful for Nagios, i.e. "OK", "WARNING", "CRITICAL". It appears that the possible return values for "Status" are: OK, Error, Degraded, Unknown, Pred Fail, Starting, Stopping, Service, Stressed, NonRecover, No Contact, or Lost Comm. I would be happy with "OK" resulting in a Nagios "OK" and anything else being a "WARNING", ideally "WARNING" followed by the "Status" output from WMI. Is there a way to do this using the NSClient "filter" and Max/Min syntax?
Bonus question: What do you do if you have multiple drives? I don't see any obvious way to specify a drive to check.
Thanks
-e
----- Original Message -----
From: Anthony Montibello
To: Eric Pearce
Cc: nagios-users at lists.sourceforge.net
Sent: Wednesday, January 14, 2009 8:58 PM
Subject: Re: [Nagios-users] Windows disk health monitoring with smartmontoolsl/NSClient++?
Use WMI:
the path to the smart data:
root/Cimv2/Win32_DiskDrive/
[Instance] --> Status
Hope this helps
Tony (Author of NC_Net)
On Tue, Jan 13, 2009 at 10:49 PM, Eric Pearce <epearce at amberpoint.com> wrote:
I'd like to get SMART disk health status for Windows machines. It looks like smartctl would work fine on Windows - has someone got it working with NSClient++?
I've found some people asking about this in the list archives, but haven't found any concrete examples.
All I'm looking for is a basic "OK" or "something bad is going to happen soon" alert from Nagios.
Thanks
-e
------------------------------------------------------------------------------
This SF.net email is sponsored by:
SourcForge Community
SourceForge wants to tell your story.
http://p.sf.net/sfu/sf-spreadtheword
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null