NRPE vs. check_by_ssh
Kevin Keane
subscription at kkeane.com
Thu Mar 26 17:05:15 CET 2009
Andreas Ericsson wrote:
> Kevin Keane wrote:
>> Christopher McAtackney wrote:
>>> 2009/3/25 Kevin Keane <subscription at kkeane.com>:
>>>
>>>> I think you are comparing apples and oranges here, because in most
>>>> situations that I can think of, the decision is dictated by the network
>>>> topology. If you are exclusively on a trusted private network,
>>>> check_by_ssh really doesn't offer any benefits. Conversely, if your
>>>> topology involves the Internet or some other untrusted network (WiFi),
>>>> then you wouldn't want NRPE in the first place.
>>>>
>>>> The only exception to the above that I can think of is when it comes to
>>>> deciding between using check_by_ssh over an untrusted network, vs. NRPE
>>>> through some other kind of tunnel or VPN. But in that case, you'd incur
>>>> encryption overhead either way, and the comparison is very different
>>>> from the question you asked.
>>>>
>>>> All that said: I don't have any first-hand experience, but I suspect
>>>> that the impact of establishing 2200 ssh connections in a five-minute
>>>> span (assuming that you are using a five-minute check interval) is
>>>> pretty substantial. The main impact actually lies in establishing and
>>>> tearing down the connections, key negotiations etc.; the encryption
>>>> during the data phase probably has only limited impact because most
>>>> checks only transmit a few bytes back and forth.
>>>>
>>>> SSH does much better with longer-duration connections when the keys are
>>>> already exchanged. This is even more true if you have a router-based
>>>> VPN, because in that case the overhead is offloaded to a different
>>>> machine.
>>>>
>>>> So if you have the option of sending the checks as NRPE through one or a
>>>> few long-term VPNs: you are probably going to be better off. Of course,
>>>> in the big picture, your mileage may vary.
>>>>
>>> Firstly, thanks for the detailed explanation of the issues involved in
>>> this choice Kevin, it's been very helpful.
>>>
>>> I'm curious though, could you elaborate on why NRPE is unsuitable if
>>> communication with my remote hosts is going to go via the Internet? Is
>>> it not sufficient that NRPE uses SSL? This may be more of a network
>>> security question than a Nagios one, but I've no real experience in
>>> either area unfortunately, so I appreciate any info you can give here.
>>>
>> No, you are right. I wasn't aware that NRPE could use SSL. In that
>> case, NRPE would be pretty much the same in terms of performance as SSH.
>>
>> That said, I am generally concerned from a security standpoint about
>> any kind of active checks going over the Internet. This is because if
>> you are monitoring, in your example, 200 hosts, you have to poke
>> holes into 200 firewalls (or into one firewall, and then set up SSL
>> or SSH keys on 200 hosts). That's 200 potential security holes all
>> over the place with little or no control, and on machines that may
>> not necessarily be hardened for access from the outside world. Worse
>> - active checks, by nature, cause a program to be launched and
>> executed on the monitored client, and usually with very high
>> permissions. You said that you check 2000 services, so that's 2000
>> plugins (give or take a few). What if a hacker found a way to
>> compromise one of your 2000 plugins? You'd have a privilege
>> escalation issue along with remote-launch capability. On 200 clients.
>>
>
> Very high permissions are normally not needed.
Depends on the plugin, but I'm not sure that this is generally true. For
instance, something as simple as log file analysis requires either root
permission on Linux (log files are typically not readable by anybody
else) or relaxed file permissions or security somewhere else. On
Windows, I'm running my monitoring agent (by default) as the Local
System account (most Windows services do that anyway). That account has
basically full access to everything on the local machine, but to nothing
on the network.
Of course check_ping, check_tcp etc. don't usually need such high
permissions.
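To illustrate the log file case: a plugin running as an unprivileged user
typically cannot even open the file, so the best it can do is report
UNKNOWN. Here is a minimal sketch in Python. The function name, the path
and the pattern are made up for the example; the return codes follow the
usual plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import re

# Conventional Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_log(path, pattern):
    """Return (return_code, message) for a simple log pattern check.

    Hypothetical helper for illustration only, not part of any plugin.
    """
    try:
        with open(path, "r", errors="replace") as log:
            hits = sum(1 for line in log if re.search(pattern, line))
    except PermissionError:
        # This is what an unprivileged monitoring user would usually hit.
        return UNKNOWN, "UNKNOWN - no permission to read %s" % path
    except FileNotFoundError:
        return UNKNOWN, "UNKNOWN - %s does not exist" % path
    if hits:
        return CRITICAL, "CRITICAL - %d line(s) matching %r" % (hits, pattern)
    return OK, "OK - no lines matching %r" % pattern
```

Something like check_log("/var/log/messages", "error") would only get past
the PermissionError branch when run as root, or after the file permissions
have been relaxed, which is exactly the trade-off I mean.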
> I prefer using NRPE because
> of two reasons:
> 1. It provides a rather simple way of specifying exactly which commands
> can be run, and with which arguments (don't enable argument parsing
> in nrpe if the receiving end isn't duly protected by firewalls etc)
> 2. If someone breaks into the Nagios server, he or she does not get the
> public keys required for running commands on the remote servers.
Can you explain that second statement? I'm not sure I follow what you
are trying to say here. Why would getting public keys be a bad thing?
They are, by definition, freely available anyway.
>> Because of these concerns, I am using passive checks almost
>> exclusively over the Internet (except for publicly available services
>> such as HTTP or SMTP, of course); I wrote an agent that resides on
>> the client as a wrapper around the excellent NSClient++ and performs
>> the actual checks. It then forwards the checks to the Nagios server
>> via NSCA over HTTPS. A second benefit is that this agent collects
>> about 40 or so check results, and then sends all of them at once
>> through a single SSL connection. That reduces the overhead of
>> establishing a secure connection by a factor of 40. BTW, the agent is
>> available as Open Source. Go to http://www.tntmonitoring.com .
>
> Sounds like a rather neat solution, although I suppose it has to be
> configured in both ends before it's actually useful (although all other
> agents require some configuration anyways, so perhaps it's not such a big
> deal). I'm not too fond of relinquishing the re-check logic of Nagios
> though, but I guess you can't get everything.
True, you do lose the recheck logic, and you also lose event handlers
and probably some other things I'm not thinking of. Actually, that's a
good point - adding some of these things might be a possible future
improvement.
As far as the configuration on both ends goes, yes, of course. That's
probably true for all Nagios checks regardless of what you do. What you
need server-side is:
- A PHP page that accepts the check results and injects them into Nagios
(downloadable from the Web site). It is the equivalent of configuring
NSCA for regular passive checks.
- An SSL certificate. This seems to be the trickiest part; I was working
with somebody else a couple of days ago who just couldn't get that to
work for a long time. The monitoring client clearly was working, but
couldn't connect to the server because of a self-signed certificate.
- And of course you need to add the host and passive checks to Nagios;
no way around that! I have all the service definitions in hostgroups
rather than individual hosts, so adding the host is as simple as making
it a member of the appropriate host groups.
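To make the first item a bit more concrete: whatever accepts the results
ultimately has to hand them to Nagios as passive check results. One common
route is the external command file, using PROCESS_SERVICE_CHECK_RESULT.
The sketch below is in Python rather than PHP and is not the actual page
from the Web site. The function names and the command file path are
assumptions for illustration (the path depends on how Nagios was
installed).

```python
import time

# Hypothetical path; the real location depends on the Nagios installation.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0, 1, 2 or 3")
    timestamp = int(time.time() if now is None else now)
    # Keep only the first output line so that an embedded newline
    # cannot start a second command.
    first_line = output.splitlines()[0] if output else ""
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, first_line)


def submit_results(results, command_file=COMMAND_FILE):
    """Write a batch of (host, service, return_code, output) tuples.

    The command file is normally a named pipe, so the whole batch is
    written in a single open/close.
    """
    lines = [format_passive_result(*result) for result in results]
    with open(command_file, "w") as cmd:
        cmd.writelines(lines)
    return len(lines)
```

The host and service names passed in have to match the host and passive
service definitions in Nagios exactly, which is why the third item below
the certificate cannot be skipped.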
--
Kevin Keane
Owner
The NetTech
Find the Uncommon: Expert Solutions for a Network You Never Have to Think About
Office: 866-642-7116
http://www.4nettech.com
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users