Nagios and a Microsoft "cluster"
Andrew Grimberg
tykeal at bardicgrove.org
Fri Dec 16 05:34:35 CET 2005
On Thu, 2005-12-15 at 21:21 -0500, Bill Mathews wrote:
> Hans Engelen wrote:
>
> > Could you provide a little more information on the issue at hand?
> >
> > A short description of your setup would help. We are definitely talking
> > about clustering here, right, not load balancing? The double ping
> > response issue sounds more like a load balancing issue.
>
>
> Well, it's allegedly a failover cluster. I don't control the Microsoft
> end of it and I've been told it's a failover. The only detail I have is
> that we're monitoring it across the Internet, along with a few hundred
> other machines and that site is the only one giving us this trouble.
> It's somewhat odd.
>
> Setup is nagios latest running on Debian stable, not sure what other
> setup info would help.

Strange. I would agree with Hans that it sounds a lot like an MS NLBS
(Network Load Balance Services) cluster and not an MSCS (MS Cluster
Services) one. However, since you say you're monitoring from across the
Internet and not from the local network, the only way that I know of to
get that kind of response would be from an MS NLBS cluster running on a
VMware ESX or GSX server. Though, it's potentially possible to elicit
that kind of response even from a non-local machine.

They may be running an NLBS "failover" cluster and not a true MSCS. NLBS
can run in a full load-balancing configuration or in an active/passive
(aka failover) configuration. Because NLBS does some stupid packet magic
to get the clustering to work, it can have the undesired side effect of
all nodes in the cluster replying to pings.

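One rough way to confirm that more than one node is answering is to look
for echo replies that share the same sequence number. The sketch below is
only an illustration: it assumes the Linux iputils ping output format
(which marks repeats with "(DUP!)"), the function name
find_duplicate_replies is made up for this example, and the sample output
uses a documentation address rather than anything from the actual site.

```python
import re

# Assumed iputils-style reply line, for example:
#   64 bytes from 192.0.2.10: icmp_seq=1 ttl=118 time=42.1 ms (DUP!)
REPLY_RE = re.compile(r"bytes from (?P<src>.+?): icmp_seq=(?P<seq>\d+)")


def find_duplicate_replies(ping_output):
    """Map each icmp_seq answered more than once to the list of reply sources."""
    replies = {}
    for line in ping_output.splitlines():
        match = REPLY_RE.search(line)
        if match:
            seq = int(match.group("seq"))
            replies.setdefault(seq, []).append(match.group("src"))
    # Keep only the sequence numbers that received more than one reply.
    return {seq: srcs for seq, srcs in replies.items() if len(srcs) > 1}


# Hypothetical sample output; 192.0.2.10 is a documentation address.
SAMPLE = """\
PING 192.0.2.10 (192.0.2.10) 56(84) bytes of data.
64 bytes from 192.0.2.10: icmp_seq=1 ttl=118 time=42.1 ms
64 bytes from 192.0.2.10: icmp_seq=1 ttl=118 time=42.9 ms (DUP!)
64 bytes from 192.0.2.10: icmp_seq=2 ttl=118 time=41.7 ms
"""

duplicates = find_duplicate_replies(SAMPLE)
```

Feeding it the saved output of a plain "ping -c 10 <host>" run against
the cluster address should show whether every request is being answered
twice or only some of them.
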
If you can get your customer to give you more information on the
cluster, I bet they will tell you that it is an NLBS cluster in unicast
mode. You might suggest to them that they work with their network techs
to switch over to multicast mode; that should hopefully help a little
with the problem. A better suggestion is for them to get a hardware
load balancer instead of using Microsoft's NLBS solution. As we've
discovered on our network, NLBS causes a lot of problems on the media
layer that kill some network security.

-Andy-