Strange service checks behavior

Sandro Vaz - UOL sandromergvaz at uol.com.br
Wed Dec 1 14:19:09 CET 2004


Folks:

I've read the f... manual, "State Types" section, but I can't understand 
why there is no hard recovery after a hard problem, which leads to wrong 
availability reports. Let me show you what's in my log files...

Example 1) After a hard problem (06:38:00) we have a weird soft problem 
(06:42:32) and then a soft recovery (06:52:18). I can't find the 
subsequent hard recovery in the logs. Is this correct?

    November 30, 2004 06:00   

[30-11-2004 06:52:18] SERVICE ALERT: 
Client-A-Host-2;Service-X;OK;SOFT;5;OK - 10 enviados, 10 recebidos, 0% 
pacotes perdidos
[30-11-2004 06:51:32] SERVICE ALERT: 
Client-A-Host-2;Service-X;WARNING;SOFT;4;CRITIAL - 10 enviados, 7 
recebidos, 30% pacotes perdidos
[30-11-2004 06:51:10] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;SOFT;3;CRITICAL - 10 enviados, 0 
recebidos, 100% pacotes perdidos
[30-11-2004 06:50:04] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;SOFT;2;CRITICAL - 10 enviados, 0 
recebidos, 100% pacotes perdidos
[30-11-2004 06:49:04] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;SOFT;1;CRITICAL - 10 enviados, 0 
recebidos, 100% pacotes perdidos
[30-11-2004 06:43:04] SERVICE ALERT: 
Client-A-Host-2;Service-X;OK;SOFT;2;OK - 10 enviados, 10 recebidos, 0% 
pacotes perdidos
[30-11-2004 06:42:32] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;SOFT;1;CRITICAL - 10 enviados, 4 
recebidos, 60% pacotes perdidos
[30-11-2004 06:38:00] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;HARD;1;(Service Check Timed Out)

Example 2) From 8:19:42 through 8:24:02, we have a hard problem and a 
hard recovery, which is correct. After that we had a hard problem 
(8:41:54) and then a bizarre critical soft (8:53:26), which I can't 
explain. At 8:57:32 we have a soft recovery. Again I can't find the hard 
recovery in the log files...

    November 30, 2004 08:00   

[30-11-2004 08:57:32] SERVICE ALERT: 
Client-A-Host-2;Service-X;OK;SOFT;6;OK - 10 enviados, 10 recebidos, 0% 
pacotes perdidos
[30-11-2004 08:57:24] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;SOFT;5;CRITICAL - 10 enviados, 0 
recebidos, 100% pacotes perdidos
[30-11-2004 08:56:22] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;SOFT;4;CRITICAL - 10 enviados, 0 
recebidos, 100% pacotes perdidos
[30-11-2004 08:55:22] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;SOFT;3;CRITICAL - 10 enviados, 0 
recebidos, 100% pacotes perdidos
[30-11-2004 08:54:22] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;SOFT;2;CRITICAL - 10 enviados, 0 
recebidos, 100% pacotes perdidos
[30-11-2004 08:53:26] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;SOFT;1;(Service Check Timed Out)
[30-11-2004 08:41:54] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;HARD;1;(Service Check Timed Out)
[30-11-2004 08:24:02] SERVICE ALERT: 
Client-A-Host-2;Service-X;OK;HARD;1;OK - 10 enviados, 9 recebidos, 10% 
pacotes perdidos
[30-11-2004 08:19:42] SERVICE ALERT: 
Client-A-Host-2;Service-X;CRITICAL;HARD;1;(Service Check Timed Out)

Analyzing these 2 situations, we have a wrong critical period (8:41:54 
through 13:57:43, when we finally have a hard recovery). Could some good 
soul explain this behavior? Without correct logs, Nagios will generate 
unreliable availability reports, because it uses only hard states to 
produce them.
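
To make the gap easier to see, here is a minimal Python sketch that scans 
SERVICE ALERT entries and lists hard problems with no later hard 
recovery. It is only an illustration: the function names (parse_alerts, 
open_hard_problems) and the SAMPLE text are my own, the entries are a 
subset copied from Example 2 with the wrapped lines rejoined, and the 
field layout (host;service;state;state type;attempt;output) is assumed 
from the entries quoted above, not taken from the Nagios sources.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the quoted entries:
# [dd-mm-yyyy hh:mm:ss] SERVICE ALERT: host;service;state;state_type;attempt;output
ALERT_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+SERVICE ALERT:\s+"
    r"(?P<host>[^;]+);(?P<service>[^;]+);(?P<state>[^;]+);"
    r"(?P<state_type>[^;]+);(?P<attempt>\d+);(?P<output>.*)"
)


def parse_alerts(text):
    """Return the alerts found in text as dicts, sorted oldest first."""
    alerts = []
    for match in ALERT_RE.finditer(text):
        entry = match.groupdict()
        entry["time"] = datetime.strptime(entry["ts"], "%d-%m-%Y %H:%M:%S")
        entry["attempt"] = int(entry["attempt"])
        alerts.append(entry)
    return sorted(alerts, key=lambda a: a["time"])


def open_hard_problems(alerts):
    """Return HARD non-OK alerts never followed by a HARD OK (same host/service)."""
    open_since = {}
    for alert in alerts:
        if alert["state_type"] != "HARD":
            continue  # soft states are skipped on purpose
        key = (alert["host"], alert["service"])
        if alert["state"] == "OK":
            open_since.pop(key, None)  # a hard recovery closes the problem
        else:
            open_since.setdefault(key, alert)  # keep the first hard problem
    return list(open_since.values())


# Subset of the Example 2 entries, wrapped lines rejoined (illustrative only).
SAMPLE = """
[30-11-2004 08:57:32] SERVICE ALERT: Client-A-Host-2;Service-X;OK;SOFT;6;OK - 10 enviados, 10 recebidos, 0% pacotes perdidos
[30-11-2004 08:53:26] SERVICE ALERT: Client-A-Host-2;Service-X;CRITICAL;SOFT;1;(Service Check Timed Out)
[30-11-2004 08:41:54] SERVICE ALERT: Client-A-Host-2;Service-X;CRITICAL;HARD;1;(Service Check Timed Out)
[30-11-2004 08:24:02] SERVICE ALERT: Client-A-Host-2;Service-X;OK;HARD;1;OK - 10 enviados, 9 recebidos, 10% pacotes perdidos
[30-11-2004 08:19:42] SERVICE ALERT: Client-A-Host-2;Service-X;CRITICAL;HARD;1;(Service Check Timed Out)
"""

still_open = open_hard_problems(parse_alerts(SAMPLE))
for alert in still_open:
    print(alert["ts"], alert["host"], alert["service"], alert["state"])
```

On this subset the 08:19:42 problem is closed by the 08:24:02 hard 
recovery, while the 08:41:54 problem stays open because the 08:57:32 
recovery is only a soft one.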

TIA,

SMV






_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users




