<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD><TITLE>weirdness in the scheduling of host checks</TITLE>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.6000.16850" name=GENERATOR></HEAD>
<BODY>
<P><FONT face=Arial><FONT size=2><SPAN class=379232718-29062009><FONT
color=#0000ff>I figured out my problem: I had two instances of Nagios
running. That would explain a lot of the scheduling weirdness, maybe all
of it. :-) </FONT></SPAN></FONT></FONT></P>
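<P><FONT face=Arial size=2>For anyone seeing similar symptoms, here is a rough sketch of one way to check for a second daemon. It assumes a Linux procps-style ps, Python 3.7 or later, and that the daemon's process name is "nagios"; find_nagios_daemons and running_nagios_daemons are hypothetical helper names, not part of Nagios. A nagios process whose parent is also nagios is treated as a forked check worker and skipped.</FONT></P>
<PRE>
```python
import subprocess


def find_nagios_daemons(ps_output, name="nagios"):
    """Return sorted PIDs of top-level processes called name.

    ps_output is the text printed by: ps -eo pid,ppid,comm
    A process is top-level when its parent is not also called name,
    so forked check workers are not counted as extra daemons.
    """
    rows = []
    for line in ps_output.strip().splitlines():
        parts = line.split(None, 2)
        # Skip the header row and anything that does not parse.
        if len(parts) != 3:
            continue
        if not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        rows.append((int(parts[0]), int(parts[1]), parts[2].strip()))

    # Map each PID to its command name so parents can be looked up.
    names = {pid: comm for pid, ppid, comm in rows}
    return sorted(
        pid
        for pid, ppid, comm in rows
        if comm == name and names.get(ppid) != name
    )


def running_nagios_daemons(name="nagios"):
    """Run ps on this machine and return the top-level daemon PIDs."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_nagios_daemons(result.stdout, name)
```
</PRE>
<P><FONT face=Arial size=2>Calling running_nagios_daemons() on the monitoring host and getting back more than one PID would suggest that a second instance is running.</FONT></P>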
<P><FONT><FONT face=Arial><FONT size=2><SPAN
class=379232718-29062009>> </SPAN>Is anyone else seeing weird things in
the scheduling of checks? I don't have a good sense of what is wrong, but
it's definitely not the way it was under Nagios 1.0 (or the way it
should <SPAN class=379232718-29062009><FONT
color=#0000ff> </FONT></SPAN>be). I've been watching the scheduling
queue on our Nagios 3 box for a week or so. Here's a list of what I've
seen:</FONT></FONT></FONT></P>
<P><FONT face=Arial><FONT size=2><SPAN class=379232718-29062009><FONT
color=#0000ff> > </FONT></SPAN>Under Nagios
3.0.6:</FONT></FONT> <BR><FONT face=Arial size=2> <SPAN
class=379232718-29062009><FONT color=#0000ff> > </FONT></SPAN>-
host checks staying at the top of the queue for a long time (sometimes over
an hour) even though their timeout is set to 30 seconds</FONT></P>
<P><FONT face=Arial><FONT size=2><SPAN class=379232718-29062009><FONT
color=#0000ff> > </FONT></SPAN>Under Nagios
3.1.2:</FONT></FONT> <BR><FONT face=Arial><FONT size=2><SPAN 
class=379232718-29062009><FONT color=#0000ff> <FONT face="Times New Roman"
color=#000000 size=3>></FONT> </FONT></SPAN> - host check showing
up unexpectedly in the scheduling queue. This morning when I looked at the queue,
the top event was about 15 minutes behind the current time, but things were
moving along okay. When I last checked, there was a host check at the top of the
queue with a next check time from 4 days ago.</FONT></FONT></P>
<P><FONT face=Arial><FONT size=2><SPAN class=379232718-29062009><FONT
color=#0000ff> > </FONT></SPAN> - We had a host go down
yesterday (Sunday), but we did not get alerted. When I looked at it in
Nagios, I noticed the host check was in an OK state and its 'last check' value
was from 12 days ago (6/17/2009)!</FONT></FONT></P>
<P><FONT face=Arial><FONT size=2><SPAN class=379232718-29062009><FONT
color=#0000ff> > </FONT></SPAN> - Host checks don't seem to be
getting stuck in the queue like they were under 3.0.6, at least not for as
long</FONT></FONT> </P>
<P><FONT face=Arial><FONT size=2><SPAN class=379232718-29062009><FONT
color=#0000ff> > </FONT></SPAN>I'm going to submit a ticket to
tracker.nagios.org, but would like to have more empirical evidence of the problem
first. All I have so far are symptoms, with no good data points (logs, errors,
etc.). Is anyone else seeing this type of behavior?</FONT></FONT></P>
<P><FONT face=Arial><FONT size=2><SPAN class=379232718-29062009><FONT
color=#0000ff> > </FONT></SPAN>Nagios 3.1.2 (also had trouble with
3.0.6)</FONT></FONT> <BR><FONT face=Arial><FONT size=2><SPAN
class=379232718-29062009><FONT color=#0000ff> <FONT face="Times New Roman"
color=#000000 size=3>></FONT> </FONT></SPAN>RHEL 5, 64-bit</FONT></FONT> 
</P>
<P> </P></BODY></HTML>