Nag event handlers restarting failed programs on NT ?

Stanley Hopcroft Stanley.Hopcroft at IPAustralia.Gov.AU
Thu Jan 16 21:32:16 CET 2003

Dear Sir,

I am writing to thank you for your reply (I will certainly take your
advice) and to summarise some options for the archives.

On Thu, Jan 16, 2003 at 09:33:42AM -0600, Carroll, Jim P [Contractor] wrote:

> > Is anyone using Nagios (event handlers) to restart failed 
> > programs on NT hosts ?

> An interesting thought.  I don't have an answer off the top of my head
> (still working on my first coffee).
> You might wish to check out, subscribe to the
> mailing list (it's quite low-volume) and post a variant of your query there.

The options are listed roughly in order of seriousness/helpfulness
(although I think option 2 is probably the most durable).

1 Convert the program - ask someone else to do it - to a service and use
the rpcclient program (from Samba-tng or Samba-2.2.x or Samba-alpha) to
start the service.

This requires that

. the Nag host be set up with a machine account on the MS host
that is running the program/service

. the program can be converted to a service (I understand from a Windows
programmer that in the case of Java applications this can be kludgy).
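
A rough sketch of what an event handler for option 1 might look like,
in Python. It assumes the handler is passed the Nagios $SERVICESTATE$
and $SERVICESTATETYPE$ macros, and it swaps in the 'net rpc service
start' command from later Samba releases in place of rpcclient, since I
cannot vouch for the exact rpcclient syntax. The server, service and
user names are hypothetical.

```python
import subprocess


def should_restart(state, state_type):
    """Only act once Nagios has confirmed the failure (a HARD CRITICAL state)."""
    return state == "CRITICAL" and state_type == "HARD"


def service_start_argv(server, service, user):
    """Build the Samba 'net rpc service start' command line.

    'user' may be given as 'name%password' in the usual Samba style;
    how the credentials are stored is left out of this sketch.
    """
    return ["net", "rpc", "service", "start", service, "-S", server, "-U", user]


def handle_event(state, state_type, server, service, user, run=subprocess.call):
    """Start the remote service if warranted; return the command's exit code.

    Returns None when no restart was attempted.
    """
    if not should_restart(state, state_type):
        return None
    return run(service_start_argv(server, service, user))
```

The 'run' argument is there only so the handler can be exercised
without touching a real host; by default the command is executed with
subprocess.call.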

2 Suggested by Mr T De Blende,

* NSClient checks to see if the program is still running.

In our case, the culprit program will be appending heartbeat messages to
a text file in a shared directory. A Nag service check will 'tail' that
file and return CRITICAL if it can find no log records (the records
will have time stamps) newer than the current time minus a threshold
interval (the last record in the file was logged more than 10 minutes
ago).

* If the program is not running, it puts a simple text file on a
Windows share on the server that is supposed to be running that
program. Just share a directory with write rights only for a certain
account that is used by the Nagios box to make the SMB connection.

* Create a small script on the Windows server that checks for the
existence of that text file in that shared directory, and if it is
there: 1) delete it and 2) restart the program.'

(This latter program may be run by AT periodically).
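
The steps of option 2 can be sketched roughly as follows, in Python.
This is only an illustration under stated assumptions: the heartbeat
record format (a leading 'YYYY-MM-DD HH:MM:SS' stamp), the file paths
and the restart command are all hypothetical, and the whole log is read
rather than just its tail, for simplicity.

```python
import os
import re
import subprocess
from datetime import datetime, timedelta

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes

# Assumed record format: "2003-01-16 21:32:16 some heartbeat text"
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def check_heartbeat(log_path, now, max_age=timedelta(minutes=10)):
    """Nagios side: return (exit_code, message) based on the last stamped record.

    Records are assumed to be appended in time order, so the last
    matching line is the newest one.
    """
    newest = None
    with open(log_path) as log:
        for line in log:
            match = STAMP.match(line)
            if match:
                newest = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    if newest is None:
        return CRITICAL, "CRITICAL - no heartbeat records found"
    age = now - newest
    if age > max_age:
        return CRITICAL, f"CRITICAL - last heartbeat {age} ago"
    return OK, f"OK - last heartbeat {age} ago"


def request_restart(flag_path):
    """Event handler side: drop the flag file on the (mounted) Windows share."""
    with open(flag_path, "w") as flag:
        flag.write("restart requested\n")


def handle_flag(flag_path, restart_cmd, run=subprocess.call):
    """Windows side: if the flag file exists, delete it and restart the program.

    Returns True when a restart was triggered, False otherwise.
    """
    if not os.path.exists(flag_path):
        return False
    os.remove(flag_path)
    run(restart_cmd)
    return True
```

A plugin wrapper would print the message and exit with the returned
code; the Windows-side function is the part that AT would run
periodically.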

3 Wait for it ... this is my idea.

Write a Tk/Expect or Perl/Tk program to drive a VNC session with the
host and use this VNC session to start the program.

It would probably be a good thing to have the program set up to be run
from the GUI (by clicking an icon that runs a bat file for example).


Yours sincerely.

Stanley Hopcroft

'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'

from Meditation 17, J Donne.
