Distributed configuration issue with staleness (thresholds?)
Bob Johnson
bobjohnson at nexus9000.com
Tue Jun 28 22:31:09 CEST 2005
Greetings to all,
In my test configuration, I have one server as the distributed node and
the other as the master node. The distributed node does all of the
checking and sends its check results to the master node via NSCA. The
checks are sent (and received) in a normal fashion to the master node, but
for some reason I am having issues with the freshness threshold on the
master server. The nagios.log excerpt below states that the check is
stale by "7" seconds even though there is a threshold of "200" seconds.
Therefore, I believe that I must be overlooking something in the
configuration and would appreciate any advice. (As a side note, quite a
few services go stale at once, not just this single check. However, not
*every* service goes stale immediately.)
From nagios.log on the master server:
"Warning: The results of service 'time' on host 'server1' are stale by 7
seconds (threshold=200 seconds). I'm forcing an immediate check of the
service."
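To make the two numbers in that message concrete, here is a minimal Python sketch that parses the warning and derives how long ago the last result would have arrived. It assumes, as the wording suggests but nothing in this post confirms, that "stale by" counts the seconds elapsed beyond the last result time plus the threshold. The names parse_stale_warning and seconds_since_last_result are hypothetical helpers, not part of Nagios.

```python
import re

# Pattern for the freshness warning as it appears in the excerpt above.
STALE_RE = re.compile(
    r"The results of service '(?P<service>[^']+)' on host '(?P<host>[^']+)' "
    r"are stale by (?P<stale_by>\d+)\s+seconds \(threshold=(?P<threshold>\d+) seconds\)"
)

def parse_stale_warning(line):
    """Return the fields of a freshness warning as a dict, or None if no match."""
    match = STALE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["stale_by"] = int(fields["stale_by"])
    fields["threshold"] = int(fields["threshold"])
    return fields

def seconds_since_last_result(fields):
    # Assumption: staleness is measured from (last result time + threshold).
    return fields["threshold"] + fields["stale_by"]

line = ("Warning: The results of service 'time' on host 'server1' are stale by 7 "
        "seconds (threshold=200 seconds). I'm forcing an immediate check of the service.")
fields = parse_stale_warning(line)
print(fields["host"], fields["service"], seconds_since_last_result(fields))
```

Under that assumption, the master would have last seen a result for this service 207 seconds before it logged the warning.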
I have a few questions:
a. Where exactly does this "7" come from in the configuration, or how is
it calculated?
b. Where exactly does this "200" come from in the configuration, or how is
it calculated?
c. Is there a recommended complete (yet barebones) master and distributed
node configuration reference for nagios.cfg, hosts.cfg, and services.cfg?
d. Are there any other logs or additional debugging details which would be
useful for this distributed staleness issue?
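For question (b), one hypothesis worth checking is that the master is not using the configured freshness_threshold at all and is deriving a threshold automatically. The Python sketch below only illustrates the arithmetic. The formula is an assumption about how Nagios might compute a threshold when freshness_threshold is left at 0 (check interval times interval_length, plus check latency, plus a fixed 15-second margin). It is not confirmed by anything in this post, and interval_length=60 is assumed to be the default. The function name auto_freshness_threshold is hypothetical.

```python
def auto_freshness_threshold(check_interval, interval_length=60, latency=0, margin=15):
    """Assumed automatic threshold in seconds when freshness_threshold is 0."""
    return check_interval * interval_length + latency + margin

# normal_check_interval 3 is the value in the master services.cfg template.
base = auto_freshness_threshold(check_interval=3)
print(base)        # 195 before any latency is added
print(200 - base)  # 5, the latency in seconds that would yield the logged 200
```

If this guess is right, a logged threshold of 200 would correspond to a 3-minute check interval with about 5 seconds of latency, which would suggest the explicit threshold in the template is not being applied to the service.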
---------------------------------------------------------
[distributed node: hosts.cfg]
# Generic host definition template
define host{
    name                            nagios-host
    notifications_enabled           0
    event_handler_enabled           1
    flap_detection_enabled          1
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    register                        0
}
# 'server1' host definition
define host{
    use                             nagios-host
    host_name                       server1
    alias                           server1
    address                         10.10.10.10
    check_command                   check-host-alive
    contact_groups                  nagios-admins
    max_check_attempts              10
    notification_interval           120
    notification_period             24x7
    notification_options            d,u,r
}
---------------------------------------------------------
[distributed node: services.cfg]
define service{
    name                            nagios-host
    active_checks_enabled           1
    passive_checks_enabled          0
    parallelize_check               1
    obsess_over_service             1
    notifications_enabled           0
    notification_interval           60
    notification_period             24x7
    notification_options            w,u,c,r
    event_handler_enabled           1
    flap_detection_enabled          1
    contact_groups                  nagios-admins
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    is_volatile                     0
    max_check_attempts              3
    check_period                    24x7
    normal_check_interval           3
    retry_check_interval            1
    register                        0
}
define service{
    use                             nagios-host
    host_name                       server1
    service_description             time
    check_command                   check_ntp!3!10
}
---------------------------------------------------------
[master node: hosts.cfg]
# Generic host definition template
define host{
    name                            nagios-host
    notifications_enabled           1
    event_handler_enabled           1
    flap_detection_enabled          1
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    register                        0
}
# 'server1' host definition
define host{
    use                             nagios-host
    host_name                       server1
    alias                           server1
    address                         10.10.10.10
    max_check_attempts              3
    contact_groups                  nagios-admins
    notification_interval           120
    notification_period             24x7
    notification_options            d,u,r
}
---------------------------------------------------------
[master node: services.cfg]
# infrastructure host template
define service{
    name                            nagios-host
    active_checks_enabled           0
    passive_checks_enabled          1
    parallelize_check               1
    obsess_over_service             1
    check_freshness                 1
    freshness_threshold             300
    check_command                   service-is-stale
    notifications_enabled           1
    notification_interval           60
    notification_period             24x7
    notification_options            w,u,c,r
    event_handler_enabled           1
    flap_detection_enabled          1
    contact_groups                  nagios-admins
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    is_volatile                     0
    max_check_attempts              3
    check_period                    24x7
    normal_check_interval           3
    retry_check_interval            1
    register                        0
}
define service{
    use                             nagios-host
    host_name                       server1
    service_description             time
}
---------------------------------------------------------
Thank you for any guidance or assistance with troubleshooting.
Cheers, Bob