logging and report issues
Edgar Shine
eshine at gmail.com
Fri Jul 29 22:04:08 CEST 2005
Greetings,
I'm trying to understand how I can fix some reporting issues. Maybe
someone could help me.
I manage a WAN radio network and I'm using Nagios to monitor these
devices. Radio links often have RTT problems, mainly on rainy days,
and I want to keep statistics on these problems.
The problem is that when an alarm is reported at the first scheduled
poll, I don't get a recovery entry in the log. And I'm pretty sure the
host recovered during the day: monitoring showed me the host's service was OK.
Here are some lines from the log files, relating to the attached file.
---BEGIN---
root@phoenix:/usr/local/nagios/var/archives# grep castanheiras nagios-07-07-2005-00.log
[1120705200] CURRENT HOST STATE: castanheiras;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 259.77 ms
[1120705200] CURRENT SERVICE STATE: castanheiras;PING;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 0%, RTA = 288.31 ms
root@phoenix:/usr/local/nagios/var/archives# grep castanheiras nagios-07-08-2005-00.log
[1120791600] CURRENT HOST STATE: castanheiras;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 376.73 ms
[1120791600] CURRENT SERVICE STATE: castanheiras;PING;OK;HARD;1;PING OK - Packet loss = 0%, RTA = 134.32 ms
root@phoenix:/usr/local/nagios/var/archives# grep castanheiras nagios-07-14-2005-00.log
[1121310000] CURRENT HOST STATE: castanheiras;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 114.10 ms
[1121310000] CURRENT SERVICE STATE: castanheiras;PING;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 0%, RTA = 2439.42 ms
root@phoenix:/usr/local/nagios/var/archives# grep castanheiras nagios-07-15-2005-00.log
[1121396400] CURRENT HOST STATE: castanheiras;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 199.86 ms
[1121396400] CURRENT SERVICE STATE: castanheiras;PING;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 0%, RTA = 213.32 ms
root@phoenix:/usr/local/nagios/var/archives# grep castanheiras nagios-07-16-2005-00.log
[1121482800] CURRENT HOST STATE: castanheiras;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 200.23 ms
[1121482800] CURRENT SERVICE STATE: castanheiras;PING;OK;HARD;1;PING OK - Packet loss = 0%, RTA = 102.60 ms
---EOT---
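For anyone who wants to pull these rotation-time snapshots out of the archives programmatically, here is a minimal Python sketch. It assumes each entry occupies a single line in the log file itself and follows the `[epoch] CURRENT SERVICE STATE: host;service;state;type;attempt;output` layout seen in the excerpt. The function name `parse_snapshot` is made up for illustration and is not part of Nagios.

```python
import re
from datetime import datetime, timezone

# Matches the "CURRENT SERVICE STATE" entry written at the start of each archived log.
SNAPSHOT = re.compile(
    r"^\[(\d+)\] CURRENT SERVICE STATE: "
    r"([^;]+);([^;]+);([^;]+);([^;]+);(\d+);(.*)$"
)

def parse_snapshot(line):
    """Return a dict for a CURRENT SERVICE STATE line, or None for any other line."""
    m = SNAPSHOT.match(line.strip())
    if m is None:
        return None
    ts, host, service, state, state_type, attempt, output = m.groups()
    return {
        # The bracketed number is a Unix epoch; converted to UTC here.
        "time": datetime.fromtimestamp(int(ts), tz=timezone.utc),
        "host": host,
        "service": service,
        "state": state,
        "state_type": state_type,
        "attempt": int(attempt),
        "output": output,
    }

sample = ("[1120705200] CURRENT SERVICE STATE: castanheiras;PING;CRITICAL;HARD;3;"
          "PING CRITICAL - Packet loss = 0%, RTA = 288.31 ms")
record = parse_snapshot(sample)
print(record["time"].isoformat(), record["state"])
```

Running it over each archive and keeping the records whose state is CRITICAL lists the days on which the service was already in alarm when the log file started.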
Logging works for alarms raised during the day, **if** the service is
not already in a critical state in the current service state when
Nagios rotates the logfile:
---BEGIN---
root@phoenix:/usr/local/nagios/var/archives# grep castanheiras nagios-07-29-2005-00.log | grep HARD
[1122519600] CURRENT HOST STATE: castanheiras;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 344.05 ms
[1122519965] SERVICE ALERT: castanheiras;PING;CRITICAL;HARD;5;PING CRITICAL - Packet loss = 0%, RTA = 451.80 ms
[1122523904] SERVICE ALERT: castanheiras;PING;OK;HARD;5;PING OK - Packet loss = 0%, RTA = 34.14 ms
[1122526189] SERVICE ALERT: castanheiras;PING;CRITICAL;HARD;5;PING CRITICAL - Packet loss = 0%, RTA = 551.29 ms
[1122527084] SERVICE ALERT: castanheiras;PING;OK;HARD;5;PING OK - Packet loss = 0%, RTA = 36.09 ms
---EOT---
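The kind of statistic I am after can be sketched from these SERVICE ALERT lines alone. The Python below is only an illustration: the helper name `critical_intervals` is hypothetical, it assumes one entry per line in the real log file, and it pairs each HARD CRITICAL alert with the next HARD OK alert for the same host and service.

```python
import re

# Only HARD state changes are considered; SOFT retries are ignored.
ALERT = re.compile(r"^\[(\d+)\] SERVICE ALERT: ([^;]+);([^;]+);([^;]+);HARD;")

def critical_intervals(lines, host, service):
    """Return (start, end) epoch pairs for CRITICAL periods closed by an OK alert."""
    intervals = []
    started = None
    for line in lines:
        m = ALERT.match(line.strip())
        if m is None:
            continue
        ts, h, s, state = m.groups()
        if (h, s) != (host, service):
            continue
        if state == "CRITICAL" and started is None:
            started = int(ts)
        elif state == "OK" and started is not None:
            intervals.append((started, int(ts)))
            started = None
    return intervals

log = [
    "[1122519965] SERVICE ALERT: castanheiras;PING;CRITICAL;HARD;5;PING CRITICAL - Packet loss = 0%, RTA = 451.80 ms",
    "[1122523904] SERVICE ALERT: castanheiras;PING;OK;HARD;5;PING OK - Packet loss = 0%, RTA = 34.14 ms",
    "[1122526189] SERVICE ALERT: castanheiras;PING;CRITICAL;HARD;5;PING CRITICAL - Packet loss = 0%, RTA = 551.29 ms",
    "[1122527084] SERVICE ALERT: castanheiras;PING;OK;HARD;5;PING OK - Packet loss = 0%, RTA = 36.09 ms",
]
periods = critical_intervals(log, "castanheiras", "PING")
print(sum(end - start for start, end in periods), "seconds CRITICAL")  # 4834 seconds CRITICAL
```

Note that this sketch only sees a CRITICAL period if a SERVICE ALERT line opens it. A period that is already CRITICAL in the CURRENT SERVICE STATE entry at rotation is not counted, which is exactly the case I am having trouble with.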
My host.conf
---BEGIN---
define host{
name host_template
notifications_enabled 1
event_handler_enabled 0
flap_detection_enabled 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
register 0
}
define host{
use host_template ; Name of host template to use
host_name castanheiras
alias Condominio Castanheiras
address 10.140.215.2
parents cisco_nobhill
check_command check-host-alive
max_check_attempts 5
contact_groups noc
notification_interval 0
notification_period 24x7
notification_options d,u,r
}
---EOT---
My services.cfg
---BEGIN---
define service{
name generic-service
active_checks_enabled 1
passive_checks_enabled 1
parallelize_check 1
obsess_over_service 1
check_freshness 0
notifications_enabled 1
event_handler_enabled 0
flap_detection_enabled 1
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
register 0
}
define service{
use generic-service
host_name castanheiras
service_description PING
is_volatile 0
check_period 24x7
max_check_attempts 5
normal_check_interval 5
retry_check_interval 3
contact_groups noc,redes,manutencao
notification_interval 0
notification_period 24x7
notification_options c,r
check_command check_ping!199.99,79%!200.0,80%
}
---EOT---
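To make the thresholds in that check_command easier to read, here is a small Python sketch of how the two arguments would be interpreted. It rests on assumptions not shown in this post: that the check_ping command definition passes the first argument as the warning threshold and the second as the critical threshold, each in the form `<RTA in ms>,<packet loss>%`, and that a value at or above a threshold trips it. The function names are hypothetical.

```python
def parse_threshold(arg):
    """Split a threshold such as '200.0,80%' into (rta_ms, loss_pct)."""
    rta, loss = arg.split(",")
    return float(rta), float(loss.rstrip("%"))

def classify(rta_ms, loss_pct, warn_arg, crit_arg):
    """Classify one ping result against warning and critical thresholds."""
    warn_rta, warn_loss = parse_threshold(warn_arg)
    crit_rta, crit_loss = parse_threshold(crit_arg)
    # Critical is tested first, so it wins when both thresholds are exceeded.
    if rta_ms >= crit_rta or loss_pct >= crit_loss:
        return "CRITICAL"
    if rta_ms >= warn_rta or loss_pct >= warn_loss:
        return "WARNING"
    return "OK"

WARN, CRIT = "199.99,79%", "200.0,80%"   # values from the service definition
print(classify(288.31, 0, WARN, CRIT))   # CRITICAL
print(classify(134.32, 0, WARN, CRIT))   # OK
```

Under these assumptions the warning band is only 0.01 ms and 1% wide, so in practice the service moves straight between OK and CRITICAL, which matches the log entries where an RTA above 200 ms with 0% loss is reported as CRITICAL.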
System info:
---BEGIN---
Nagios version: 2.0b3
Linux 2.4.29
Apache 1.3.33
gd 2.0.33
---END---
Any tips? Thanks in advance for your time.
rgds,
Edgar Shine
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ScreenShot004.jpg
Type: image/jpeg
Size: 201801 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/users/attachments/20050729/61947126/attachment.jpg>