2.0 upgrade, passive checks problem

Cott Lang cott at internetstaff.com
Wed Jan 4 04:29:37 CET 2006


I've had passive checks working for several years in 1.x, with never a
hitch.

Unfortunately, that all changed when I upgraded to 2.0.   Suddenly,
freshness checks never occurred. Ever.

I've re-read the docs several times, and my config seems okay. My first
problem seems to be that my "check_period" under 1.x was always "none",
which worked fine there but appears to suppress freshness checks in 2.0.

If I change it to "24x7", I start getting freshness checks. However,
they seem to totally ignore freshness_threshold and use the
normal_check_interval.  If I comment out freshness_threshold and define
normal_check_interval to what I want, I seem to get random values.

For example, a service set to 10 minutes tells me this:

[1136344804] Warning: The results of service 'x' on host y are stale by
12 seconds (threshold=821 seconds).  I'm forcing an immediate check of
the service.

Where'd 821 seconds come from?
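
One possible explanation, offered as an unverified assumption rather than
confirmed 2.0 behavior: when no freshness_threshold is set, the threshold may
be derived from the check interval in seconds, plus the service's recorded
check latency, plus a small fixed pad. The 15-second pad, the default
interval_length of 60, and both function names below are assumptions made for
illustration. Under them, a 10-minute interval and a reported threshold of 821
seconds would imply about 206 seconds of latency.

```python
def estimated_auto_threshold(check_interval, latency=0, interval_length=60, pad=15):
    """Guess at an auto-calculated freshness threshold, in seconds.

    ASSUMPTION: threshold = interval in seconds + check latency + fixed pad.
    This formula is not taken from the post; it is a hypothesis to test.
    """
    return check_interval * interval_length + latency + pad


def implied_latency(threshold, check_interval, interval_length=60, pad=15):
    """Invert the assumed formula to see what latency would explain a threshold."""
    return threshold - check_interval * interval_length - pad


# A service with normal_check_interval 10 that logged threshold=821 seconds:
print(implied_latency(821, 10))
print(estimated_auto_threshold(10, latency=206))
```

If the latency shown for the service in the status data is close to that
figure, the assumption holds up; if not, the number comes from somewhere else.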

Worse, I have other services with nearly identical definitions that
don't indicate they are stale or that a freshness check is being
scheduled, but suddenly go critical:

[1136344634] SERVICE ALERT: host;service;CRITICAL;HARD;1;CRITICAL:
service success not reported

The normal_check_interval is set to 2 hours, but it seems to go critical
every ~10-15 minutes. 

I'm at a loss at this point; I can only "kinda" get passive checks
working. It seems like I must be missing something obvious here in the
2.0 upgrade, but I'm befuddled.  I've been using a template for all my
passive services like this:


define service {
  name                          passive-service
  active_checks_enabled         0       ; Active service checks are disabled
  passive_checks_enabled        1       ; Passive service checks are enabled/accepted
  parallelize_check             1       ; Active service checks should be parallelized
  obsess_over_service           1       ; We should obsess over this service (if necessary)
  check_freshness               1       ; Check service 'freshness' (the default is NOT to)
  notifications_enabled         1       ; Service notifications are enabled
  event_handler_enabled         1       ; Service event handler is enabled
  flap_detection_enabled        1       ; Flap detection is enabled
  process_perf_data             1       ; Process performance data
  retain_status_information     1       ; Retain status information across program restarts
  retain_nonstatus_information  1       ; Retain non-status information across program restarts
  max_check_attempts            1
  normal_check_interval         1560    ; 26 hours
  retry_check_interval          1
  is_volatile                   0
  check_period                  24x7
  notification_interval         15
  notification_period           24x7
  notification_options          w,c,r
  ; freshness_threshold         93600   ; 26 hours  appears useless!
  register                      0
}
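
The two "26 hours" values in the template use different units:
normal_check_interval counts time units of interval_length seconds, while
freshness_threshold is given in seconds. A minimal sketch of that conversion,
assuming interval_length is left at 60 (the helper name is hypothetical):

```python
INTERVAL_LENGTH = 60  # ASSUMPTION: interval_length in the main config is 60 seconds


def intervals_to_seconds(intervals, interval_length=INTERVAL_LENGTH):
    """Convert a value expressed in check intervals into seconds."""
    return intervals * interval_length


normal_check_interval = 1560   # from the template, in intervals
freshness_threshold = 93600    # from the commented-out line, in seconds

seconds = intervals_to_seconds(normal_check_interval)
print(seconds == freshness_threshold)  # both settings describe the same span
print(seconds / 3600)                  # length of that span in hours
```

If interval_length differs from 60 on a given install, the two settings would
no longer describe the same span.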

Any help would be appreciated!

thanks!




