<div dir="ltr">Hi again,<br><br>I edited the Nagios history log because the original was too big to be sent to the list. All the information is still there, though. Sorry for the inconvenience.<br><br>The original message is below.<br>
<br>Best regards,<br>Rafael Barbosa<br><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Rafael Barbosa</b> <span dir="ltr"><<a href="mailto:rrbarbosa@gmail.com">rrbarbosa@gmail.com</a>></span><br>
Date: Fri, Aug 8, 2008 at 8:59 AM<br>Subject: Flapping and notifications<br>To: <a href="mailto:nagios-users@lists.sourceforge.net">nagios-users@lists.sourceforge.net</a><br><br><br><div dir="ltr">Hello everybody,<br><br>
I have been using Nagios for about 4-5 months now to monitor a small network (we are expanding the number of hosts soon) at my university. Our goal is to be aware of the status of the hosts and some services on the network, and to be notified every time something goes wrong, a task for which Nagios seems perfectly suited.<br>
<br>While analyzing the logs from the last few days I came across a strange situation. On two occasions I received two consecutive Critical state notifications for the PING check without receiving a Warning or OK notification in between, so it happened more than once. I looked at the "Host Alert History" to try to figure out why these notifications were sent, but that left me even more confused. In some cases the alert history does not show the HARD state change that would trigger the alert. My best guess is that these consecutive notifications are caused by flapping of the host/device, but I am not sure about it. After I enabled flapping notifications I did not see this problem anymore, but maybe I just don't have enough data yet to observe it. Note that when the problem happened, flapping was detected but no notification was sent for it. Another thing missing from my log is the "Flapping Stopped" alert, which sometimes does not appear and makes it even more difficult to find the cause of the problem.<br>
<br>These are the two critical notifications I got from "Host Notifications"; attached is the "Host Alert History" for the same day (I hope it is not too big).<br><br><table border="0"><tbody><tr>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/extinfo.cgi?type=1&host=ubisense1105" target="_blank">ubisense1105</a></td>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/extinfo.cgi?type=2&host=ubisense1105&service=PING" target="_blank">PING</a></td>
<td>OK</td>
<td>2008-08-01 18:38:20</td>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/config.cgi?type=contacts#nagiosadmin" target="_blank">nagiosadmin</a></td>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/config.cgi?type=commands#notify-service" target="_blank">notify-service</a></td>
<td>PING OK - Packet loss = 0%, RTA = 1.88 ms</td>
</tr>
<tr>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/extinfo.cgi?type=1&host=ubisense1105" target="_blank">ubisense1105</a></td>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/extinfo.cgi?type=2&host=ubisense1105&service=PING" target="_blank">PING</a></td>
<td>CRITICAL</td>
<td>2008-08-01 17:14:30</td>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/config.cgi?type=contacts#nagiosadmin" target="_blank">nagiosadmin</a></td>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/config.cgi?type=commands#notify-service" target="_blank">notify-service</a></td>
<td>PING CRITICAL - Packet loss = 100%</td>
</tr>
<tr>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/extinfo.cgi?type=1&host=ubisense1105" target="_blank">ubisense1105</a></td>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/extinfo.cgi?type=2&host=ubisense1105&service=PING" target="_blank">PING</a></td>
<td>CRITICAL</td>
<td>2008-08-01 12:42:30</td>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/config.cgi?type=contacts#nagiosadmin" target="_blank">nagiosadmin</a></td>
<td><a href="http://www.sensordatalab.org/nagios/cgi-bin/config.cgi?type=commands#notify-service" target="_blank">notify-service</a></td>
<td>PING CRITICAL - Packet loss = 100%</td></tr></tbody></table><br>Does anybody know what could cause this behaviour? When a host is flapping, are notifications about its services still issued? How is it decided what is logged in the "Host Alert History"?<br>
<br>Best regards,<br>Rafael Barbosa<br></div>
</div><br></div>