how to use servicedependency?

Aaron Devey adevey at omniture.com
Thu Sep 27 05:09:29 CEST 2007


If I am reading your question right, the dependency works, but you
currently get alerts for both sv1.dummy1 and sj2.router1, and you only
want alerts for sj2.router1.  If so, you could configure sv1.dummy1 so
that it doesn't send notifications.  Unfortunately, you might then have
trouble getting sj2.router1 to register a recovery if sv1.dummy1
recovers first, because sj2.router1 stops being checked once sv1.dummy1
is OK.  You could try a circular dependency (I'm not even sure Nagios
allows that) where sj2.router1 only runs if sv1.dummy1 is failing, and
sv1.dummy1 only runs if sj2.router1 is passing.  But then neither check
would run when sv1.dummy1 is passing and sj2.router1 is failing.

This is a difficult problem to solve with service dependencies.
Basically, you want <service> to go critical if both <check A> AND
<check B> fail, but to recover if either <check A> OR <check B> passes.
Unfortunately, with your service dependency, the status of <service> is
tied directly to the status of <check B>, and <check B> never updates
while <check A> is passing.  So you either need <service> itself to
determine the status of both checks and alert accordingly, or you need
an event handler for <check A> that submits an 'OK' status for
<check B> whenever <check A> is passing.
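
For reference, the event-handler route could be sketched roughly as
below.  This is only an illustration under stated assumptions: the
function name, the argument order, and the idea of passing the command
file path as an argument are hypothetical and not taken from the
configuration in this thread.  The PROCESS_SERVICE_CHECK_RESULT external
command is the standard way to submit a passive result, and it requires
external commands and passive service checks to be enabled.

```shell
#!/bin/bash
# Hypothetical event handler for <check A> (sv1.dummy1 in this thread).
# Assumed arguments:
#   $1 = state of <check A> (e.g. the $SERVICESTATE$ macro)
#   $2 = host name of <check B>
#   $3 = service description of <check B>
#   $4 = path to the Nagios external command file (site-specific)

submit_ok_if_recovered() {
  local state="$1" host="$2" service="$3" cmdfile="$4"
  # Only act when <check A> is OK; otherwise leave <check B> alone.
  [ "$state" = "OK" ] || return 0
  # External command format:
  # [time] PROCESS_SERVICE_CHECK_RESULT;<host>;<service>;<return_code>;<output>
  printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;0;OK - reset because primary check passed\n' \
    "$(date +%s)" "$host" "$service" >> "$cmdfile"
}

# Run only when all four arguments are given, so the file can also be sourced.
if [ "$#" -eq 4 ]; then
  submit_ok_if_recovered "$@"
fi
```

The handler would be attached to <check A> and called with
$SERVICESTATE$ as its first argument, so that every time <check A>
returns to OK, <check B> is reset to OK as well.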

The first of those two options is definitely the easier one.  It simply
consists of a small shell script that runs <check A> and, if <check A>
fails, runs <check B> and returns its status.  Consider a script such as
the following:

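#!/bin/bash
# Usage: check_double <first_host> <second_host>
# Runs the first check; only if it fails, runs the second and reports its status.
if [ "$#" -ne 2 ]; then
  echo "UNKNOWN: usage: $0 <first_host> <second_host>"
  exit 3
fi

# Arrays keep each argument intact even if a hostname contains odd characters.
CHECK_ONE=(/path-to-checks/check_ping -H "$1" -t 2 -p 2 -w 500,50% -c 999,99%)
CHECK_TWO=(/path-to-checks/check_ping -H "$2" -t 2 -p 2 -w 500,50% -c 999,99%)

if "${CHECK_ONE[@]}" >/dev/null 2>&1; then
  echo "Check one OK."
  exit 0
else
  # Hand over to the second check so Nagios sees its output and exit code.
  exec "${CHECK_TWO[@]}"
fi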

Substitute your own check commands for CHECK_ONE and CHECK_TWO, of
course.  The first would be the equivalent of your "check-link" command,
and the second the equivalent of your "check_nrpe!check_router1"
command.  Note that I used $1 and $2 here, so the first argument to the
script is the first host to check and the second argument is the second
host.  You don't have to use arguments and could hard-code the values
into the script instead, but arguments make the script easier to reuse
if your installation grows.  The second check is ONLY executed if the
first one fails.
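
To see the fallback behaviour without touching Nagios or the network,
the same logic can be exercised with stub checks.  The following is a
minimal sketch; run_double, check_pass, and check_fail are hypothetical
names that stand in for the wrapper and the real plugins.

```shell
#!/bin/bash
# Stand-alone illustration of the fallback logic using stub checks.
# run_double <first_cmd> <second_cmd>: both names are hypothetical stand-ins.
run_double() {
  if "$1" >/dev/null 2>&1; then
    echo "Check one OK."
    return 0
  fi
  # First check failed: the second check decides the output and exit code.
  "$2"
}

# Stubs that mimic plugin exit codes (0 = OK, 2 = CRITICAL).
check_pass() { echo "PING OK"; return 0; }
check_fail() { echo "PING CRITICAL"; return 2; }

for pair in "check_pass check_fail" "check_fail check_pass" "check_fail check_fail"; do
  set -- $pair
  rc=0
  run_double "$1" "$2" || rc=$?
  echo "exit=$rc"
done
```

Only the last combination, where both stubs fail, produces a non-zero
exit code, which is the single case that should raise an alert.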

This way you only need one host, one service, and no dependencies.  If
you named your checkcommand "check_double" the service would be
something like:

define service {
        use service-template
        host_name sj2
        service_description sj2.router1
        check_command check_double!first_hostname!second_hostname
}
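
The service above assumes a matching command definition.  A minimal
sketch follows, assuming the script is saved as check_double in the
plugin directory that $USER1$ points to; the file name and location are
assumptions, so adjust them to your installation.

```
define command {
        command_name check_double
        command_line $USER1$/check_double $ARG1$ $ARG2$
}
```

Here $ARG1$ and $ARG2$ receive first_hostname and second_hostname from
the check_command line of the service.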

Good luck!

-Aaron


Jeremy C. Reed wrote:
>
> (I posted a couple of weeks ago, but only got one response, which was
> different from what I think I want to do.)
>
> I am running Nagios 2.9.
>
> I want: if a check_ping fails then I don't want an alert sent to me
> unless a second test (check_nrpe to a remote system that does the same
> check_ping) fails.
>
> I am reading http://nagios.sourceforge.net/docs/2_0/dependencies.html
> (I was looking at 3_0 last time.) And I am looking at
> http://www.linickx.com/blog/archives/271/how-to-monitor-wordpress-with-nagios/
>
> Where is execution_failure_criteria and notification_failure_criteria
> documented for 2.9?
>
> Can someone please provide an example of only sending a problem alert if
> two different check_commands fail and the second check_command is not done
> if the first one is OK?
>
> This is what I have:
>
> define service {
>         use service-template
>         host_name sj2
>         service_description sj2.router1
>         check_command check_nrpe!check_router1
> }
>
> # The "dependent" is the object that needs something.
> define servicedependency {
>         dependent_host_name sj2
>         dependent_service_description sj2.router1
>         host_name sv1
>         service_description sv1.dummy1
> # o = fail on an OK state, the dependent service will not be actively
> # checked if the master service is in OK
>         execution_failure_criteria o
> #       notification_failure_criteria o
> }
>
> define service {
>         use service-template
>         host_name sv1
>         service_description sv1.dummy1
>         check_command check-link
> }
>
>
> But I am getting two alerts if both don't return OK. I only want one
> alert. Also I am unsure how to use the execution_failure_criteria and
> notification_failure_criteria.
>
> And I do not want my "sj2.router1" to even be checked if the first
> "sv1.dummy1" is successful. But if sv1.dummy1 fails, then I want the
> sj2.router1 check to happen. And if it fails then send my alert.
>
>
>
>   Jeremy C. Reed
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2005.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>

