(no subject)
Joseph B. McQueen
jmcqueen at wpsc.com
Thu Aug 7 14:43:20 CEST 2003
Our connectivity is as shown:
remote node -> Cisco 2600 -> Frac T1 -> Cisco 3600 -> Core Switch ->
Monitoring Server
This is pretty much the same for most remote devices. I have reproduced
the problem many times now on hardware and operating systems from
various vendors. The connectivity is typical of most enterprise networks.
I often use our existing management server (What's Up Gold) to baseline
whether this device is actually up or not. I'm running it in parallel
with Nagios to ensure that we are operating flawlessly on the new system
before removing the old. What's Up Gold (connected to the same switch)
does not exhibit this problem, which points more and more to a
"Linux" issue.
I appreciate your input on this. I'm pretty sure it's related directly
to Linux and not the network. I was hoping someone might have seen this
before as I seem to be having a hard time determining the root of the
problem.
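
One way to capture the before, during, and after picture from a second machine is to log ICMP and TCP reachability side by side with timestamps, so a window where telnet works but ping fails shows up in the log. The following is a minimal sketch, not something taken from Nagios or What's Up Gold: the function names, the default telnet port 23, and the example address are illustrative assumptions, and the ping flags (-c, -W) assume the Linux iputils version of ping.

```python
import socket
import subprocess
import time


def icmp_reachable(host, timeout_s=2):
    """Return True if a single ICMP echo request gets a reply.

    Assumes the Linux iputils ping (-c count, -W reply timeout in seconds).
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 2,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def tcp_reachable(host, port=23, timeout_s=2):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def format_sample(when, host, icmp_ok, tcp_ok):
    """Build one log line: local timestamp, host, ICMP state, TCP state."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(when))
    icmp = "up" if icmp_ok else "DOWN"
    tcp = "up" if tcp_ok else "DOWN"
    return f"{stamp} {host} icmp={icmp} tcp={tcp}"


def watch(host, port=23, interval_s=10, samples=None):
    """Print one sample every interval_s seconds (forever if samples is None)."""
    count = 0
    while samples is None or count < samples:
        line = format_sample(
            time.time(), host, icmp_reachable(host), tcp_reachable(host, port)
        )
        print(line, flush=True)
        count += 1
        time.sleep(interval_s)
```

Calling watch("10.0.0.50") (a made-up address) on both the Nagios server and another host on the same switch, then cycling the link, would give two logs to compare. If only the Nagios server shows icmp=DOWN with tcp=up after the link returns, that supports the theory that the problem is local to that machine.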
Rob Nelson wrote:
>
>> There is no VPN involved as our network is only private circuits. As
>> for a route, that would not explain why I could telnet to the device,
>> but not ping. The problem is very specific to ICMP. As well, having
>> another machine on the same switch being able to ping the device
>> indicates it is not related to the network, but more specifically to
>> the Nagios server. I'm only running a default gateway with no routing
>> protocols and no static routes. I've checked the routes on the device
>> before and after the problem and they do not change. It is a very
>> weird problem.
>
>
> Yes, but on this other host, were you pinging the machine before it
> went down? It could be anything from an arp bridge table holding bad
> information across a link recycle to screwed up ICMP access-lists. If
> you can duplicate the before-and-after picture entirely, then I'd
> guess on the machine, but not until you have another machine pinging
> the same device before, during, and after the link goes up and down.
>
> How is this device connected? Most of my hosts are something like:
>
> remote node -> [wireless eq] -> PIX firewall <- VPN -> central PIX
> firewall -> monitoring server
>
> If the node goes down and reconnects on a different wireless unit, we
> were having some problems because someone set the entire site's
> equipment to use a bridge learn timeout of 700 minutes. Possibly
> something similar?
>
>
> Rob Nelson
> Network Administrator, Capitol Broadband
> C: 919-369-1874
> rob at capband.net
>
>