Regarding Trends status after Network Outage
BOLLENGIER Eric
ebollengier at sigma.fr
Thu Dec 2 11:04:59 CET 2004
Hi,
I see the same bug (Nagios 1.2). It shows up as a race condition after a
host reboot.
ssh down -> reboot -> host up -> ssh down -> ssh up
[1099042385] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Connection refused
[1099042445] SERVICE ALERT: test;ssh;CRITICAL;SOFT;2;Socket timeout
[1099042525] SERVICE ALERT: test;ssh;CRITICAL;HARD;3;Socket timeout
[1099042715] HOST ALERT: test;DOWN;SOFT;1;CRITICAL
[1099042725] HOST ALERT: test;DOWN;SOFT;2;CRITICAL
[1099042735] HOST ALERT: test;DOWN;SOFT;3;CRITICAL
[1099042745] HOST ALERT: test;DOWN;SOFT;4;CRITICAL
[1099042755] HOST ALERT: test;DOWN;HARD;5;CRITICAL
[1099042755] SERVICE ALERT: test;ping;CRITICAL;HARD;1;CRITICAL
[1099042935] HOST ALERT: test;UP;HARD;1;PING OK
[1099042935] SERVICE ALERT: test;ping;OK;HARD;1;PING OK
[1099042945] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Socket timeout
[1099043005] SERVICE ALERT: test;ssh;OK;SOFT;2;TCP OK
====> BUG: ssh was in a CRITICAL HARD state, but the recovery to OK is logged as SOFT!
[1099043265] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Socket timeout
[1099043335] SERVICE ALERT: test;ssh;CRITICAL;SOFT;2;Socket timeout
[1099043395] SERVICE ALERT: test;ssh;CRITICAL;HARD;3;Socket timeout
[1099043475] HOST ALERT: test;DOWN;SOFT;1;CRITICAL
[1099043485] HOST ALERT: test;DOWN;SOFT;2;CRITICAL
[1099043495] HOST ALERT: test;DOWN;SOFT;3;CRITICAL
[1099043505] HOST ALERT: test;DOWN;SOFT;4;CRITICAL
[1099043515] HOST ALERT: test;DOWN;HARD;5;CRITICAL
[1099043565] SERVICE ALERT: test;ping;CRITICAL;HARD;1;CRITICAL
[1099043715] HOST ALERT: test;UP;HARD;1;PING OK
[1099043715] SERVICE ALERT: test;ping;OK;HARD;1;PING OK
[1099043745] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Socket timeout
[1099043815] SERVICE ALERT: test;ssh;CRITICAL;SOFT;2;Socket timeout
[1099043865] SERVICE ALERT: test;ssh;OK;HARD;3;TCP OK
=====> Here it is fine, because ssh comes back up after 2 checks
If you want to look for this bug in your Nagios log files, you can use
my simple Perl script (see attachment).
PS: to use it:
for i in nagios-*2004*
do
./mayday_bug_trends.pl "$i"
done
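Since the attachment was scrubbed from the archive, here is a minimal sketch in Python of the kind of check such a script could perform. It is not the original mayday_bug_trends.pl. The function name find_soft_recoveries and the detection rule (an OK logged as SOFT while the last HARD state of the same service was not OK) are assumptions based only on the log excerpt above.

```python
import re

# Matches lines such as:
# [1099043005] SERVICE ALERT: test;ssh;OK;SOFT;2;TCP OK
ALERT_RE = re.compile(
    r"^\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);([^;]+);(SOFT|HARD);(\d+);(.*)$"
)


def find_soft_recoveries(lines):
    """Return (timestamp, host, service, last_hard_state) for every OK
    that is logged as SOFT although the service's last HARD state was
    a problem state (assumed signature of the bug described above)."""
    last_hard = {}  # (host, service) -> last state logged as HARD
    suspects = []
    for line in lines:
        match = ALERT_RE.match(line.strip())
        if not match:
            continue  # HOST ALERT lines and other entries are ignored
        ts, host, service, state, state_type, _attempt, _output = match.groups()
        key = (host, service)
        previous = last_hard.get(key)
        if state == "OK" and state_type == "SOFT" and previous not in (None, "OK"):
            suspects.append((int(ts), host, service, previous))
        if state_type == "HARD":
            last_hard[key] = state
    return suspects


# ssh lines copied from the log excerpt in this message.
SAMPLE = """\
[1099042525] SERVICE ALERT: test;ssh;CRITICAL;HARD;3;Socket timeout
[1099042945] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Socket timeout
[1099043005] SERVICE ALERT: test;ssh;OK;SOFT;2;TCP OK
[1099043395] SERVICE ALERT: test;ssh;CRITICAL;HARD;3;Socket timeout
[1099043745] SERVICE ALERT: test;ssh;CRITICAL;SOFT;1;Socket timeout
[1099043815] SERVICE ALERT: test;ssh;CRITICAL;SOFT;2;Socket timeout
[1099043865] SERVICE ALERT: test;ssh;OK;HARD;3;TCP OK
"""

if __name__ == "__main__":
    for ts, host, service, previous in find_soft_recoveries(SAMPLE.splitlines()):
        print(f"[{ts}] {host};{service}: OK logged as SOFT after HARD {previous}")
```

On the excerpt, this flags only the recovery at 1099043005. The second recovery (1099043865) is logged as HARD and is not reported. A service with no earlier HARD entry in the file is never flagged, so the first lines of a rotated log can hide a case.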
Regards
On Thursday, 2 December 2004 at 10:05 +0530, Nilesh wrote:
> Dear All,
>
> I have noticed a strange behaviour of Trends in nagios.
> I'm using nagios-1.2
>
> Whenever there is a network outage, Nagios updates the status
> immediately.
> After network connectivity recovers, all host checks and service checks
> run again and update the availability information
> for hosts and services. But Trends often keeps
> showing hosts as "HOST UNREACHABLE" and services as
> "CRITICAL".
>
> When I reboot the Nagios server, the status recovers,
> but that is not a solution.
>
> So how can I resolve this problem?
> What I want is for Trends to update immediately, as soon as a host and/or
> service check succeeds after a network outage.
>
> Waiting For Reply
> With regards
>
> Linux Admin
>
--
Eric BOLLENGIER, System Administrator - Ext. 1325
SIGMA Informatique http://www.sigma.fr
3 rue Newton, BP 4127, 44241 La Chapelle sur Erdre Cedex
tel : 02.40.37.14.00
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mayday_bug_trends.pl
Type: application/x-perl
Size: 1172 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/users/attachments/20041202/56bb5f0f/attachment.bin>