<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.2963" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=196462322-09112006>We dealt with this by installing a local
caching-only nameserver on the Nagios host itself. This also took a lot of
load off the main nameservers. resolv.conf was set to use 127.0.0.1 first,
with our normal nameservers listed after it as fallbacks. A nice side
effect was that it vastly sped up name
resolution.</SPAN></FONT></DIV>
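<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2>To check a setup like this, something along the lines of the sketch below could help. It reads resolv.conf-style text to confirm the local cache is listed first, and times a lookup through the system resolver, which should be the same path the plugins take. The helper names (parse_resolv_conf, time_lookup), the 192.0.2.x addresses and the sample options line are made up for illustration, and the parsing is simplified.</FONT></DIV>

```python
import socket
import time


def parse_resolv_conf(text):
    """Return (nameservers, options) from resolv.conf-style text (simplified)."""
    nameservers, options = [], []
    for line in text.splitlines():
        # Drop comments; this sketch treats '#' and ';' as comment markers.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "options":
            options.extend(fields[1:])
    return nameservers, options


def time_lookup(hostname):
    """Seconds taken by one lookup via the system resolver, or None if it fails."""
    start = time.monotonic()
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return None
    return time.monotonic() - start


# Hypothetical file contents: local cache first, normal servers as fallbacks.
SAMPLE = """\
nameserver 127.0.0.1
nameserver 192.0.2.53
nameserver 192.0.2.54
options timeout:2
"""

if __name__ == "__main__":
    servers, opts = parse_resolv_conf(SAMPLE)
    print("local cache listed first:", servers[:1] == ["127.0.0.1"])
    print("resolver options:", opts)
    # Replace 'localhost' with a few monitored hostnames to compare timings.
    print("lookup seconds:", time_lookup("localhost"))
```

<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2>Running the timing part before and after the change, with the contents of the real /etc/resolv.conf in place of SAMPLE, gives a rough picture of how much the local cache helps.</FONT></DIV>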
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=196462322-09112006></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT face=Arial color=#0000ff size=2><SPAN
class=196462322-09112006>Steve</SPAN></FONT></DIV>
<DIV> </DIV><!-- Converted from text/plain format -->
<P><FONT size=2>--<BR>Steve Shipway<BR>ITSS, University of Auckland<BR>(09) 3737
599 x 86487<BR>s.shipway@auckland.ac.nz<BR><BR></FONT></P>
<DIV> </DIV><BR>
<BLOCKQUOTE
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #0000ff 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B>
nagios-users-bounces@lists.sourceforge.net
[mailto:nagios-users-bounces@lists.sourceforge.net] <B>On Behalf Of
</B>stucky<BR><B>Sent:</B> Friday, 10 November 2006 6:57 a.m.<BR><B>To:</B>
Az<BR><B>Cc:</B> nagios<BR><B>Subject:</B> Re: [Nagios-users] timeouts when
using secondary dns<BR></FONT><BR></DIV>
<DIV></DIV>Yay!! That totally did it. Thx Az, I hadn't even considered messing
with the resolver cuz I was sure it was a Nagios issue, so I thought I had to fix
Nagios.<BR>If that wasn't a textbook example of how well mailing lists can
work then I don't know what is... <BR><BR>thx<BR><BR>
<DIV><SPAN class=gmail_quote>On 11/7/06, <B class=gmail_sendername>Az</B>
&lt;<A href="mailto:az@whoever.org">az@whoever.org</A>&gt; wrote:</SPAN>
<BLOCKQUOTE class=gmail_quote
style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid">stucky
wrote:<BR>> I use the check_by_ssh plugin for most of my stuff and I
noticed that<BR>> if the primary nameserver is unavailable nagios starts
freaking out.<BR>> All of a sudden all plugins time out. I tested it
using the 'host' <BR>> command and it only takes about 1 second longer to
lookup hosts using<BR>> the secondary nameserver.<BR>> The default
timeout for check_by_ssh is 10 seconds. I cranked it up to<BR>> 30 and
still I get timeouts. I'm not sure I understand that one. <BR>> Has
anyone else seen this?<BR>We had a similar issue in that our primary DNS was
doing strange things,<BR>and it quite often took 5 or even 10 seconds to
perform a DNS lookup.<BR>What we were seeing was 70% of service checks (and
subsequently host <BR>checks) failing by timing out. The key was the
multiple of 5 seconds.<BR>The resolver timeout on, say, RHEL3 is based on
RES_TIMEOUT in<BR>resolv.h... which was 5 seconds.<BR><BR>We added the
following to our resolv.conf, and found the problems went
away:<BR><BR>&nbsp;&nbsp;&nbsp;&nbsp;options timeout:2 rotate<BR><BR>This
sets the timeout for waiting for a reply to 2 seconds, and tells<BR>the
resolver to rotate through your 'nameserver' entries rather than<BR>always
hitting #1, then #2,
etc.<BR><BR>Cheers.<BR><BR><BR><BR><BR></BLOCKQUOTE></DIV><BR><BR
clear=all><BR>-- <BR>stucky </BLOCKQUOTE></BODY></HTML>