Discussion of future of Nagios NT client agents and their functions (NSClient and NRPE_Win) - long
Tim Shouldice
tim at mintoskatingclub.com
Thu May 15 16:36:37 CEST 2003
All:
Development of a second client for Windows (nrpe_win) has been started by Michael
Wirtgen.
This client will have much of the nrpe functionality that I had proposed for NSClient
3.0: specifically, use of the check_nrpe client, use of SSL, responding only to
pre-specified IP addresses, and having nrpe_win work by executing plugins on the
Windows server.
The plugins can be written in any language - perl, python, wsh, vb, c, delphi, whatever.
With the increasing emphasis by Microsoft on exposing easy to use interfaces to system
configuration and performance data (WMI), I see a very bright future for this approach.
A plugin, however, has some limitations: it is executed, collects its data, determines
the state, and returns the state and a text string. This is good for many things (getting
CPU percentages, etc.), but it is not good where the monitoring process needs to be on-going.
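To make the plugin life cycle concrete, here is a minimal sketch in Python of the "collect, determine state, return state and text" pattern. The numeric return codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the function names and the thresholds are hypothetical and are not taken from NSClient or nrpe_win.

```python
# Nagios plugin return codes (the usual plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate_cpu(percent, warn=80.0, crit=95.0):
    """Map one CPU sample to (state, text). Thresholds are hypothetical."""
    if percent is None or not 0.0 <= percent <= 100.0:
        return UNKNOWN, "CPU UNKNOWN - no valid sample"
    if percent >= crit:
        state = CRITICAL
    elif percent >= warn:
        state = WARNING
    else:
        state = OK
    return state, "CPU %s - load %.1f%%" % (STATE_NAMES[state], percent)


def main(sample):
    """Print the text string and return the state, as a plugin would."""
    state, text = evaluate_cpu(sample)
    print(text)
    return state
```

A real plugin would obtain the sample itself and finish with sys.exit(main(sample)), so that the state reaches the caller as the process exit code. Nothing survives between two runs, which is the limitation described above.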
On-going monitoring is needed (or easier) for some types of checks. Anything which
submits a passive check usually does so by a continually running process. A good example
would be monitoring the NT Event Log and sending a passive check when certain criteria
are met. A plugin would carry a lot of overhead to perform this type of check: it would
have to maintain a state file, read it on execution to determine the event id of the
last event it examined, then open the event log, scan all events from that event id to
the most current one, and apply the alerting criteria to each event. This would take
more time than most active checks are permitted (10-30 seconds), and waiting for these
checks to return would be a strain on Nagios if they are being executed against 20-50
Windows servers at once.
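The bookkeeping a plugin would need on every run can be sketched as follows. This is only an illustration in Python: the events are plain dictionaries rather than real Event Log records, and the state file layout, the "id" field and all function names are assumptions made for the example.

```python
import json
import os
import tempfile


def default_state_path():
    """Hypothetical location for the plugin's state file."""
    return os.path.join(tempfile.gettempdir(), "eventlog_state.json")


def load_last_id(state_path):
    """Return the id of the last event examined, or 0 on the first run."""
    try:
        with open(state_path) as fh:
            return int(json.load(fh)["last_id"])
    except (OSError, ValueError, KeyError, TypeError):
        return 0


def save_last_id(state_path, last_id):
    """Remember how far the scan got, for the next execution."""
    with open(state_path, "w") as fh:
        json.dump({"last_id": last_id}, fh)


def scan_new_events(events, state_path, matches):
    """Scan events newer than the saved position; return those that match.

    `events` is assumed to be sorted by ascending "id".
    """
    last_id = load_last_id(state_path)
    hits = []
    for event in events:
        if event["id"] <= last_id:
            continue  # already examined on an earlier run
        if matches(event):
            hits.append(event)
        last_id = event["id"]
    save_last_id(state_path, last_id)
    return hits
```

Every execution pays for reading the state file, opening the log and walking forward from the saved position, and that is the overhead a continually running process avoids.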
A second example of where on-going monitoring is needed is when you are monitoring by
Windows call-back functions. A call-back function is where your program registers a
function with Windows that Windows calls when an event has occurred. The Service Control
Manager uses call back functions to query a service for its status on startup and
shutdown. This is why wrappers for services are less than ideal. There are a number of
useful things in Windows that can be monitored through call-back functions; in
particular, a variety of security events are implemented through callbacks. Some of
Microsoft's applications, such as Message Queuing, also use callbacks due to the
asynchronous nature of the events.
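The call-back pattern itself can be sketched in a few lines of Python. EventSource below is a hypothetical stand-in for whatever facility accepts the registration (it is not a Windows API), and the "severity" and "message" fields are invented for the example.

```python
class EventSource:
    """Stand-in for a system facility that accepts call-back registration."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        """Remember a function to call when an event occurs."""
        self._callbacks.append(callback)

    def fire(self, event):
        """Called by the 'system' side: notify every registered function."""
        for callback in self._callbacks:
            callback(event)


class Monitor:
    """Long-running monitor that reacts to events as they are delivered."""

    def __init__(self, source):
        self.alerts = []
        source.register(self.on_event)

    def on_event(self, event):
        # Alerting criteria: keep only events marked as errors.
        if event.get("severity") == "error":
            self.alerts.append(event["message"])
```

The registration only has value while the process that owns on_event is still running, which is why this kind of monitoring does not fit a plugin that exits after each check.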
A third example is wait functions. These use a messaging type paradigm where you register
the wait function in one part of your code and then in another part you go into a wait
loop and react when the function returns. Code that uses wait functions needs to be
threaded, as this type of code is blocking by nature. Good things that can be monitored by wait
functions include FindFirstChangeNotification() for changes to files and directories and
FindFirstPrinterChangeNotification() for changes to printers.
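The reason for threading can be shown with a small Python sketch. It does not call the Windows functions named above; a threading.Event stands in for the change-notification handle, and the class and parameter names are hypothetical.

```python
import threading


class ChangeWatcher(threading.Thread):
    """Worker thread that blocks in a wait loop so the main thread need not."""

    def __init__(self, change_signal, stop_signal, on_change):
        super().__init__(daemon=True)
        self.change_signal = change_signal  # stand-in for a notification handle
        self.stop_signal = stop_signal      # set by the owner to end the loop
        self.on_change = on_change          # reaction, e.g. queue a passive check

    def run(self):
        while not self.stop_signal.is_set():
            # Blocks here; the short timeout lets the loop notice stop_signal.
            if self.change_signal.wait(timeout=0.1):
                self.change_signal.clear()  # re-arm before reacting
                self.on_change()
```

The thread that calls wait() can do nothing else in the meantime, so a service that must also answer active check requests would run each watcher in a thread of its own, as here.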
I see NSClient evolving to handle these types of events. NSClient would also continue to
provide its existing functionality for backwards compatibility.
I see it working as follows:
NSClient is loaded as a service, monitoring only CPU (as currently).
Check_nt requests active_check data and gets the results returned immediately.
A mechanism then needs to be created to request NSClient go into event-based monitoring
mode for things such as Event Log events, directory changes, printer changes, security
changes, etc. I haven't decided on the best mechanism to submit these requests. The
results would be sent to Nagios as passive check results. There could be a second Unix
client that maintains a config file of NT clients and checks. This could be executed as
part of the Nagios startup processes. This at least would keep configuration centralized.
An alternative would be a windows client that populates a config file that NSClient reads
on startup.
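Whatever mechanism is chosen for requesting event-based monitoring, the result has to reach Nagios as a passive check. Here is a rough Python sketch of formatting one result, assuming the Nagios external command PROCESS_SERVICE_CHECK_RESULT; the function name and the validation rules are hypothetical, and how the line is transported from the Windows machine to the Nagios server is left open.

```python
def passive_result_line(host, service, return_code, output, timestamp):
    """Format one passive service check result as an external command line."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0, 1, 2 or 3")
    for field in (host, service):
        # Fields are separated by semicolons, so names must not contain one.
        if ";" in field or "\n" in field:
            raise ValueError("invalid character in %r" % field)
    text = output.replace("\n", " ")  # a command is a single line
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s" % (
        timestamp, host, service, return_code, text)
```

For example, passive_result_line("ntserver1", "EventLog", 2, "Event 7031 in System log", 1052995200) yields the line "[1052995200] PROCESS_SERVICE_CHECK_RESULT;ntserver1;EventLog;2;Event 7031 in System log", where the host and service names are made up for the illustration.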
Some may wonder - why two services instead of one? Well, with two developers, maintaining
the code requires CVS and release coordination. Secondly, NSClient is currently in Delphi
and short of someone else picking it up and re-writing it in C, it will continue to be in
Delphi, whereas nrpe_win will be in C/C++. Also, there is much to be said for modularity.
BMC Patrol is an industrial strength monitoring tool for Windows and it implements its
monitoring through 3 NT services. Oracle databases are typically implemented as 3-5
services. So I don't see a big issue with Nagios's Windows client being implemented as
two services.
Thoughts, comments, suggestions?
Tim Shouldice
NSClient Support - http://support.tsmgsoftware.com
This thread is also in the enhancements section of the above forum.