This is a confirmed bug in 3.2.0. We Europeans noticed this last week when we switched from DST.<div><br></div><div>Start of thread on the problem: <a href="http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg29695.html">http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg29695.html</a></div>
<div><br></div><div>There are a bunch of suggestions there on how to get your checks scheduled sanely again without losing acknowledgements etc.</div><div><br></div><div>Regards,</div><div>Martin Melin</div><div><br><div class="gmail_quote">
On Tue, Nov 3, 2009 at 5:03 AM, Frost, Mark {PBG} <span dir="ltr">&lt;<a href="mailto:mark.frost1@pepsi.com">mark.frost1@pepsi.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<div bgcolor="white" lang="EN-US" link="blue" vlink="purple">
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;color:#1F497D">I guess I'm another "me too". We use Nagios
3.0.6, but I had just set up an upgrade to 3.2.0. From what I could see,
my distributed nodes had been sending data just fine for 45 minutes or
so. When I double-checked the performance graphs just before retiring for
the night I saw no data coming in. When I traced this back to the
distributed nodes, their scheduling queue showed no checks scheduled until well
into the next day. We have some checks that run as frequently as every minute.</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:#1F497D"> </span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:#1F497D">I assumed this was a weird bug with 3.2.0, panicked and went
back to 3.0.6 a little after midnight and things have been fine ever since.
I was going to spend more time observing 3.2.0 in a more contained environment
to see if this was normal behavior. My timing (checks stopping around
11pm Sunday night) sounds the same so perhaps it's not just my imagination.</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:#1F497D"> </span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:#1F497D">One thing that bothered me a bit was that I didn't see messages
in the central servers indicating that it was marking service checks as stale
and checking automatically. I saw no stale messages in the log and it
should have been well past the freshness thresholds of most checks. As I
say, it was late and I decided to roll back before I investigated.</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:#1F497D"> </span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:#1F497D">I've got thousands of service checks so forcing rescheduling
wouldn't work for me.</span></p>
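<p class="MsoNormal">If the obstacle is the manual clicking rather than the rescheduling itself, one possible approach is to write forced-check commands to the Nagios external command file. The following is a minimal Python sketch, not something proposed in this thread. SCHEDULE_FORCED_HOST_CHECK and SCHEDULE_FORCED_SVC_CHECK are standard Nagios external commands; the function names, the <code>spread</code> parameter, the example host and service, and the command file path are assumptions for illustration, and the sketch assumes external commands are enabled.</p>

```python
import time


def forced_check_commands(hosts, services, now=None, spread=0):
    """Build external command lines that force a check of each host/service.

    hosts    -- iterable of host names
    services -- iterable of (host_name, service_description) pairs
    now      -- Unix timestamp to use (defaults to the current time)
    spread   -- seconds over which to stagger the checks (hypothetical knob,
                meant to avoid running thousands of checks at the same instant)
    """
    ts = int(time.time()) if now is None else int(now)
    targets = [("SCHEDULE_FORCED_HOST_CHECK", h) for h in hosts]
    targets += [("SCHEDULE_FORCED_SVC_CHECK", f"{h};{s}") for h, s in services]
    lines = []
    for i, (cmd, target) in enumerate(targets):
        # Stagger check times evenly across the spread window.
        offset = (i * spread) // len(targets)
        lines.append(f"[{ts}] {cmd};{target};{ts + offset}")
    return lines


def submit(lines, command_file):
    """Write the command lines to the Nagios command file (a named pipe).

    The path is installation specific; /usr/local/nagios/var/rw/nagios.cmd
    is only an assumed example and must match your command_file setting.
    """
    with open(command_file, "w") as fh:
        for line in lines:
            fh.write(line + "\n")


if __name__ == "__main__":
    # Example data is made up; a real list would come from your configuration.
    for line in forced_check_commands(["web1"], [("web1", "HTTP")], spread=60):
        print(line)
```

<p class="MsoNormal">Calling <code>submit()</code> with the generated lines would hand them to a running Nagios process. The <code>spread</code> value is a judgment call: a larger window is gentler on a server with thousands of checks, at the cost of a slower recovery.</p>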
<p class="MsoNormal"><span style="font-size:10.0pt;color:#1F497D"> </span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:#1F497D">Mark</span></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:#1F497D"> </span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;color:windowtext">From:</span></b><span style="font-size:10.0pt;color:windowtext"> Les Fenison
[mailto:<a href="mailto:les@deltatechnicalservices.com" target="_blank">les@deltatechnicalservices.com</a>] <br>
<b>Sent:</b> Monday, November 02, 2009 9:47 PM<br>
<b>To:</b> Andy Howell<br>
<b>Cc:</b> <a href="mailto:nagios-users@lists.sourceforge.net" target="_blank">nagios-users@lists.sourceforge.net</a><br>
<b>Subject:</b> Re: [Nagios-users] Nagios stopped checking most of my services!</span></p>
</div>
</div><div><div></div><div class="h5">
<p class="MsoNormal"> </p>
<p class="MsoNormal">Well, so far 3 of us with the same problem on the same
day. I have to believe it is daylight savings time related. <br>
<br>
My fix is to go click on each service one by one and reschedule. Then
they start checking normally again.<br>
<br>
I wonder if there is any way to force an automatic reschedule of all services
and hosts for next year when this happens again?<br>
<br>
Andy Howell wrote: </p>
<p class="MsoNormal">Les Fenison wrote: <br>
<br>
</p>
<p class="MsoNormal">I had nagios working great. Checking 6 hosts and about
85 services. Then suddenly, all services on all hosts except one stopped
checking. The next scheduled check is about 24 hours from the last
check. I had been checking every 5 minutes. <br>
<br>
Restarting Nagios didn't help. I am using a GUI, NagioSQL, to
edit my configuration files so I suspect it did something to me but I have no
clue where to look except where I have already looked. <br>
<br>
What can cause nagios to just stop checking everything like that or to randomly
switch to every 24 hours rather than the configured every 5 minutes? <br>
<br>
I am having to manually do force checks to get it to check. <br>
<br>
Here are some things I have checked... <br>
<br>
Hosts check_interval is 5, retry_interval is 1 <br>
Services check_interval is 10, retry_interval is 2 <br>
<br>
So where could Nagios be getting the idea that it is supposed to be every 24
hours? </p>
<p class="MsoNormal" style="margin-bottom:12.0pt"><br>
I had the same experience yesterday. Maybe daylight savings related? At about
11pm, all the services were scheduled for 11pm the following day. I figured it
was something I did wrong. I noticed that "next_check" time in
/var/log/nagios/retention.dat was wrong. I renamed the file and restarted
nagios. It worked fine after that. <br>
<br>
I am using version 3.2. <br>
<br>
Regards, <br>
<br>
Andy </p>
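<p class="MsoNormal">To confirm the symptom described above before renaming the retention file, the <code>next_check</code> values can be inspected programmatically. The following Python sketch lists hosts and services whose next check lies further ahead than a chosen threshold. It assumes the retention file consists of <code>host {</code> and <code>service {</code> blocks containing <code>key=value</code> lines such as <code>host_name</code>, <code>service_description</code> and <code>next_check</code>; the function name and the one-hour threshold are illustrative assumptions.</p>

```python
import time


def late_checks(text, now, max_delay):
    """Return (kind, host, service, next_check) tuples for retention entries
    whose next_check is more than max_delay seconds after now."""
    results = []
    kind = None
    block = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("{") and "=" not in line:
            # Start of a block such as "host {" or "service {".
            kind = line[:-1].strip()
            block = {}
        elif line == "}":
            if block is not None and kind in ("host", "service"):
                next_check = int(block.get("next_check", "0"))
                if next_check - now > max_delay:
                    results.append((kind,
                                    block.get("host_name"),
                                    block.get("service_description"),
                                    next_check))
            block = None
        elif block is not None and "=" in line:
            key, value = line.split("=", 1)
            block[key] = value
    return results


if __name__ == "__main__":
    # Path taken from the message above; adjust it for your installation.
    with open("/var/log/nagios/retention.dat") as fh:
        entries = late_checks(fh.read(), int(time.time()), max_delay=3600)
    for kind, host, service, next_check in entries:
        print(kind, host, service, time.ctime(next_check))
```

<p class="MsoNormal">With checks configured at 5 or 10 minute intervals, anything this reports with a one-hour threshold is a likely victim of the rescheduling problem.</p>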
<p class="MsoNormal"> </p>
<div>
<p class="MsoNormal">-- </p>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="2" width="100%" align="center">
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.0pt;color:#5555EE">Les Fenison<br>
Delta Technical Services<br>
<a href="http://www.DeltaTechnicalServices.com" target="_blank">www.DeltaTechnicalServices.com</a><br>
<a href="mailto:les@DeltaTechnicalServices.com" target="_blank">les@DeltaTechnicalServices.com</a><br>
503-766-0076 </span></p>
</div>
</div>
</div></div></div>
</div>
</div>
<br>_______________________________________________<br>
Nagios-users mailing list<br>
<a href="mailto:Nagios-users@lists.sourceforge.net">Nagios-users@lists.sourceforge.net</a><br>
<a href="https://lists.sourceforge.net/lists/listinfo/nagios-users" target="_blank">https://lists.sourceforge.net/lists/listinfo/nagios-users</a><br>
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.<br>
::: Messages without supporting info will risk being sent to /dev/null<br></blockquote></div><br></div>