NRPE vs. check_by_ssh

Charlie Reddington charlie.reddington at gmail.com
Wed Mar 25 21:13:28 CET 2009


On Mar 25, 2009, at 2:30 PM, RijilV wrote:

> 2009/3/24 Christopher McAtackney <cristoir at gmail.com>:
>> Hi all,
>>
>> I was wondering if someone could give a brief overview of the pros /
>> cons of using NRPE to monitor my remote hosts versus using the
>> check_by_ssh command?
>>
>> I'm aware that check_by_ssh increases the CPU overhead, but I'm not
>> clear on the level of impact here - does this increase the load on the
>> monitoring machine in direct relation to the number of hosts being
>> monitored? For example, if I was using check_by_ssh to monitor, say,
>> 2000 services spread across 200 hosts, would I experience significant
>> slowdown on my monitoring machine?
>>
>> Cheers for any info,
>>
>> Chris
>>
>
>
> SSH is going to slow it down on both sides of the communication.  SSH
> does quite a bit more in terms of setting up the connection, which
> involves using asymmetric encryption to set up a shared secret for
> symmetric encryption, verifying keys for the asymmetric part,
> verifying access, and allocating a session.  Whereas NRPE, even with
> encryption, just does a simple pre-shared secret for the symmetric
> encryption, which is much faster even if using the same encryption
> algorithm.
>
>
> One thing you could do with SSH to speed it up (and I would argue make
> it faster than NRPE, depending on the stability of your network) would
> be to use ControlMaster.  ControlMaster is an SSH v2 feature, where you
> create a connection and can open up multiple sessions with that
> ControlMaster for other SSH processes.  This saves you not only the
> key-exchange heavy lifting but also you're not opening up a new socket
> on the remote host.  In order to really make it worth it you'd have to
> spawn a process that was continuously connected.  I wrote an ugly
> check_by_ssh that would spawn a ControlMaster if one didn't exist and
> use it if it did.  Reduced the load/latency quite a bit for SSH
> checks.  Though if I had to do it again I'd use 'ControlMaster auto'
> (man 5 ssh_config) and create a separate check that was responsible
> for maintaining the ControlMaster, then you could use the stock
> check_by_ssh without any modifications.
>
>
> That all being said, you might want to think about a distributed setup
> anyhow, if for nothing more than redundancy.  200 servers and 2,000
> checks is a lot of responsibility for a singleton; you could break it
> 50/50 between two servers, each of which could take over for the other
> one if it fails.
>
>
> .r'
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null

+1 on the ControlMaster. We have about 1000 checks over 300 hosts and
using ControlMaster made the box much more stable and, quite frankly,
usable. Saved a lot of plugin timeouts as well.
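
For anyone who wants to try it, here is a rough Python sketch of how a
check wrapper could build an ssh command line that shares one master
connection per host. It only builds the command, it does not run it. The
function name, the socket directory, the 600 second persist time, the
example host and the plugin path are all made up for illustration, and
ControlPersist needs a reasonably recent OpenSSH (5.6 or later, as far as
I know), so treat that option as an assumption about your setup.

```python
import shlex


def build_ssh_command(host, remote_command,
                      control_dir="/var/run/nagios/ssh",  # hypothetical path
                      persist_seconds=600):               # hypothetical value
    """Return an argv list for ssh that shares one master connection per host."""
    # ssh expands %r, %h and %p to the remote user, host and port,
    # so every host gets its own control socket.
    control_path = control_dir + "/%r@%h:%p"
    return [
        "ssh",
        "-o", "BatchMode=yes",        # never prompt for a password; fail instead
        "-o", "ControlMaster=auto",   # reuse an existing master, or become one
        "-o", "ControlPath=" + control_path,
        "-o", "ControlPersist=%d" % persist_seconds,  # keep an idle master around
        host,
        remote_command,
    ]


if __name__ == "__main__":
    # Example host and plugin path are placeholders, not a real setup.
    argv = build_ssh_command(
        "web01.example.com",
        "/usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,8,6",
    )
    # Print a copy-and-paste friendly command line instead of running it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The same options can go into ssh_config (man 5 ssh_config) instead, in
which case the stock check_by_ssh should pick them up without a wrapper.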

Think about 1000 checks every 5 or 10 minutes. That's 1000 encrypted
tunnels going up and down in every interval. That's a lot of overhead for
a quick check, let alone if your server is checking say 5 or 10 things
back to back on the same host.
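
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It assumes the checks are spread evenly over the interval and that
every check without multiplexing costs one full SSH handshake; the
function name is just for illustration.

```python
def handshakes_per_second(checks, interval_minutes):
    """Average new SSH connections per second if each check opens its own."""
    return checks / (interval_minutes * 60.0)


if __name__ == "__main__":
    for minutes in (5, 10):
        rate = handshakes_per_second(1000, minutes)
        print("1000 checks every %d min: %.2f handshakes/sec" % (minutes, rate))
```

That works out to roughly 3.3 new connections per second at a 5 minute
interval and about 1.7 at 10 minutes. With a persistent master per host,
most of those handshakes go away and you are left with one long-lived
connection per host.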

http://www.torchbox.com/blog/ssh_tips_2.html

Charlie
