service dependencies & recovery
Tom De Cooman
tom.decooman at ugent.be
Mon Sep 15 12:56:00 CEST 2008
We run a lot of checks through NRPE on all our hosts, so we implemented a
general check_nrpe service, one that only checks whether NRPE is listening
on the host, and made all the other NRPE-based checks depend on it.
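A minimal sketch of what such a general check could look like, not the
literal configuration in use here: the command name check_nrpe_alive and
the generic-service template are assumed names, and $USER1$ is assumed to
point at the plugin directory. Called without -c, check_nrpe only asks the
NRPE daemon for its version, so the check tests nothing beyond the daemon
answering.
define command{
        command_name    check_nrpe_alive        ; hypothetical name
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$
        }
define service{
        use                     generic-service ; assumed template
        host_name               servername
        service_description     Check NRPE
        check_command           check_nrpe_alive
        }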
Service dependency example:
define servicedependency{
        host_name                       servername
        service_description             Check NRPE      ; master service
        dependent_service_description   Load            ; depends on the master
        execution_failure_criteria      o               ; o = OK
        notification_failure_criteria   w,u,c           ; WARNING, UNKNOWN, CRITICAL
        }
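If several NRPE-based checks on the same host should depend on
'Check NRPE', a single definition can probably cover them, since
dependent_service_description accepts a comma-separated list. A sketch
that reuses the criteria from the example above unchanged; Disk and Procs
are hypothetical service names:
define servicedependency{
        host_name                       servername
        service_description             Check NRPE
        dependent_service_description   Load,Disk,Procs ; Disk, Procs: hypothetical
        execution_failure_criteria      o
        notification_failure_criteria   w,u,c
        }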
The service dependency works in one respect: when the NRPE daemon goes
down, we only get a notification saying that 'Check NRPE' is down. The
other checks go critical, but no notification is sent out for them.
The problem is that once NRPE on the monitored host is functional again,
some of the checks that have a service dependency applied do not recover;
they remain in the 'critical' state.
Such a check returns to an OK state only after we manually schedule an
active check for it.
The Nagios version we are using is 3.03.
We noticed the same issue with a previous version (3.01).
Has anyone encountered similar issues?
I have already posted this on the users list and will try to investigate
further myself.
--
Tom De Cooman <tom.decooman at ugent.be>