fun with (silent) change from HARD to SOFT state
Ethan Galstad
egalstad at nagios.org
Fri Jan 23 17:22:39 CET 2009
Michal Svoboda wrote:
> Hello,
>
> I've discovered a weird behavior, which can be replicated thus:
>
> 1. Let a service be configured for max attempts N before going to HARD
> non-ok state
>
> 2. Make the service fail and wait for N checks to pass (i.e., until the
> service enters N/N HARD non-ok state); at this point notifications
> are sent, etc.
>
> 3. Change the configuration of the service to have M > N max attempts
> and restart nagios
>
> 4. Now the state of the service is N/M _HARD_ non-ok
>
> 5. If the N+1th check results in non-ok, then the service state goes to
> N+1/M _SOFT_
>
> 6. If some future check results in ok, then the service performs a SOFT
> recovery; as a result, at the very least no recovery notifications are sent
>
> 6a. If the condition in (5) does not occur, i.e., the N+1th check results
> immediately in ok, the service still performs a SOFT recovery from
> an apparently HARD state (even according to the logs)
>
> Now, one way to look at this behavior is that it is logical, because
> I've fiddled with the config, and I can expect anomalies and blah blah.
>
> Another way to look at it is that there have been notifications sent in
> step (2), yet there are no recovery notifications; in other words, once
> the sirens have been sounded (and the fire brigade is on the way, and
> the president is being woken up), they should also be properly shut off.
>
> So the question is whether or not to introduce a patch that prevents
> entering a SOFT state once a service (or a host) is already in a HARD
> non-ok state?
>
>
> With regards,
> Michal Svoboda
Nice catch. I just added some code that will readjust the current check
attempt at startup if the host/service was in a hard problem state.
That will accommodate config changes related to max check attempts that
are made before (re)start.
- Ethan Galstad
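
For readers following the thread, here is a toy model of the scenario in steps 1-6 and of the startup readjustment described in the reply. It is a sketch under stated assumptions, not Nagios source code: the names Service, process_check, readjust_on_startup and scenario are hypothetical, the model assumes that HARD vs. SOFT is derived by comparing the current attempt with max attempts, and it assumes the readjustment sets the current attempt to the new max attempts (the reply does not say which value is used).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Service:
    """Hypothetical, simplified stand-in for a service's retained state."""
    max_attempts: int
    current_attempt: int = 1
    ok: bool = True
    hard: bool = True


def process_check(svc: Service, result_ok: bool) -> Optional[str]:
    """Apply one check result and return the notification sent, if any."""
    if result_ok:
        # Assumption: a recovery counts as HARD only if the attempt
        # counter had reached max_attempts.
        hard_recovery = (not svc.ok) and svc.current_attempt >= svc.max_attempts
        svc.hard = svc.ok or hard_recovery
        svc.ok = True
        svc.current_attempt = 1
        return "RECOVERY" if hard_recovery else None

    was_hard_problem = (not svc.ok) and svc.hard
    if svc.ok:
        svc.ok = False
        svc.current_attempt = 1
    elif svc.current_attempt < svc.max_attempts:
        svc.current_attempt += 1
    # Assumption: the state type is re-derived from the counters on every check.
    svc.hard = svc.current_attempt >= svc.max_attempts
    if svc.hard and not was_hard_problem:
        return "PROBLEM"
    return None


def readjust_on_startup(svc: Service) -> None:
    """Sketch of the fix: keep a retained HARD problem HARD after a restart."""
    if not svc.ok and svc.hard:
        # Assumed target value; the reply only says the attempt is readjusted.
        svc.current_attempt = svc.max_attempts


def scenario(apply_fix: bool) -> List[str]:
    """Replay steps 1-6 with N=3 and M=5; return the notifications sent."""
    svc = Service(max_attempts=3)
    sent = [process_check(svc, False) for _ in range(3)]  # steps 1-2
    svc.max_attempts = 5                                  # step 3: M > N, restart
    if apply_fix:
        readjust_on_startup(svc)
    sent.append(process_check(svc, False))                # step 5: check N+1 fails
    sent.append(process_check(svc, True))                 # step 6: service recovers
    return [n for n in sent if n is not None]
```

In this model, scenario(False) sends a problem notification but no recovery, which mirrors the behavior reported above, while scenario(True) sends both.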