Question about NRPE operation
Andreas Ericsson
ae at op5.se
Fri Sep 9 10:41:13 CEST 2005
schönfeld / in-medias-res wrote:
> Hi,
>
> I'm having some problems checking an ncpfs filesystem and
> have a suspicion about the cause, so I have a question about how
> NRPE operates.
>
> Ok, here we go:
> Nagios initiates a check of a particular service on Host X,
> so it sends a request to the NRPE daemon on Host X.
>
> Host X checks the request, starts the plugin that performs the
> requested service check, and switches into a waiting state => waiting
> for the plugin's answer.
>
> Now imagine that the ncpfs is busy, because of another "heavy operation"
> on it. So the plugin runs and runs, but has to wait for the filesystem
> and does not return a result to the nrpe in the meanwhile.
>
The plugin itself is supposed to exit gracefully after a specified
maximum amount of time. This is generally achieved by installing a
signal handler to catch SIGALRM and making an alarm(2) call. The
signal handler should make sure all locks and resources are released
(the kernel will release them otherwise, but relying on that is
considered terribly bad form).
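
As a rough illustration of the SIGALRM approach described above, here
is a minimal Python sketch. The names run_with_timeout, PluginTimeout
and _on_alarm are made up for this example, and returning CRITICAL on
a timeout is an assumption; a real plugin may choose a different state.

```python
import signal
import time

STATE_OK = 0
STATE_CRITICAL = 2  # assumption: report a timeout as CRITICAL


class PluginTimeout(Exception):
    """Raised by the SIGALRM handler when the check takes too long."""


def _on_alarm(signum, frame):
    # Turn the signal into an exception so cleanup code (finally blocks,
    # context managers) gets a chance to release locks and resources.
    raise PluginTimeout()


def run_with_timeout(check, timeout_seconds):
    """Run check() and return (exit_code, message), giving up on timeout."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(timeout_seconds)  # wraps alarm(2)
    try:
        return STATE_OK, check()
    except PluginTimeout:
        return STATE_CRITICAL, (
            "CRITICAL - check timed out after %d seconds" % timeout_seconds
        )
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)
```

A slow check such as run_with_timeout(lambda: time.sleep(5), 1) should
come back after about one second with the CRITICAL tuple instead of
blocking for the full five seconds.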
> So now the question is: Does the NRPE server have a timeout after which
> it'll *kill* the plugin?
Yes, naturally. Otherwise it could risk filling up the process-table, or
plugins with infinite loops could bring the entire system down.
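
The idea of a parent that refuses to wait forever can be sketched as
follows. This is not NRPE's actual code (NRPE is written in C); it is a
small Python illustration. The function name run_plugin and the choice
of UNKNOWN as the state reported on timeout are assumptions made for
the example.

```python
import subprocess
import sys

STATE_UNKNOWN = 3  # assumption: state reported when the plugin is killed


def run_plugin(argv, timeout_seconds):
    """Run a plugin command and return (exit_code, first_output_text).

    If the plugin does not finish in time, subprocess.run() kills the
    child and reaps it, so it cannot linger in the process table.
    """
    try:
        done = subprocess.run(
            argv, stdout=subprocess.PIPE, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        return STATE_UNKNOWN, (
            "plugin timed out after %s seconds" % timeout_seconds
        )
    return done.returncode, done.stdout.decode(errors="replace").strip()


if __name__ == "__main__":
    # Hypothetical stand-in for a hung plugin: sleeps far past the limit.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_plugin(hung, 1))
```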
> If so: Linux ncpfs is not capable of threading
> ncpfs operations. So if one process is accessing the ncpfs and gets
> a SIGKILL, the ncp connection becomes invalid and the source of my
> problem would be identified.
>
This really can't be. Any locks and resources held by a terminated
process should be cleared by the kernel (if not by the process itself).
If they aren't, you've found a kernel bug.
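
One way to check the expected behaviour is a small experiment like the
Python sketch below. It takes an flock(2) lock in a child process,
sends the child SIGKILL, and then tests whether the parent can take the
lock. The helper names are invented for the example, and it uses a
temporary file on a local filesystem, so it only shows what should
happen in general; it says nothing about ncpfs specifically.

```python
import fcntl
import os
import signal
import subprocess
import sys
import tempfile

# Child program: take an exclusive lock, announce it, then hang.
CHILD_CODE = (
    "import fcntl, sys, time\n"
    "f = open(sys.argv[1], 'a')\n"
    "fcntl.flock(f, fcntl.LOCK_EX)\n"
    "print('locked', flush=True)\n"
    "time.sleep(60)\n"
)


def try_lock(f):
    """Return True if a non-blocking exclusive lock could be taken."""
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def lock_survives_kill():
    """Return True if the lock is still held after the holder is killed."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE, path], stdout=subprocess.PIPE
    )
    try:
        child.stdout.readline()  # wait until the child holds the lock
        with open(path, "a") as f:
            if try_lock(f):
                raise RuntimeError("child never held the lock")
            child.send_signal(signal.SIGKILL)
            child.wait()  # reap the child so its descriptors are closed
            return not try_lock(f)
    finally:
        if child.poll() is None:
            child.kill()
            child.wait()
        child.stdout.close()
        os.unlink(path)
```

On a healthy kernel lock_survives_kill() is expected to return False.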
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Lead Developer