FW: Using service dependencies
Greg Vickers
g.vickers at qut.edu.au
Tue Nov 15 01:23:52 CET 2005
Hi Deborah,
(I assume you are using v2.x)
Deborah Martin wrote:
> I'm trying to use service dependencies as part of my nagios config.
>
> Basically, I want to do 2 checks, the primary check connects to a
> database. The secondary check is for RAM usage.
> However, if the primary check fails I don't want nagios to try and
> attempt the secondary check as there is no point if
> a system is down. I still want to see Nagios alert via the web front-end
> but not be notified via email when the primary check
> is in a Unknown or Critical state.
I would ask: Why bother with this dependency at all?
Sequential scenario:
0) Your DB check runs, all OK.
1) The DB crashes.
2) The RAM check dependency is checked, the DB service last reported OK.
3) The RAM check is executed and returns OK (unless your RAM check
really does depend on the DB somehow??)
4) Your DB is checked later and returns UNKNOWN or CRITICAL.
5) Then NEXT time your RAM check runs, it will not run because the DB
service is UNKNOWN or CRITICAL.
Normal operation of Nagios would go like this:
0) Check services on a host.
1) As soon as ANY service on a host changes to a non-OK state, *that
host status is checked* and *no other service checks on any hosts are
executed* until the state of that host is determined. This is why host
checks have to execute quickly - no other service checks will be
processed/executed until host status is determined.
2) If the host is in a non-OK state, *no notifications for services on
that host are sent out until that host comes back up*
(So if all services fail but the IP still responds to your check_host
command (ping or ICMP) you will receive many service notifications.)
<quote> ... I don't want nagios to try and attempt the secondary check
as there is no point if a system is down.</quote>
Fair enough. If you leave the RAM check non-dependent, *your host
downtime will decrease* (according to Nagios) because there are more
services to run on a host, and a host state change from DOWN to UP
will be detected quicker when there are fewer dependent services.
Since the RAM and DB checks aren't really dependent, having this
dependency doesn't bring any value (IMHO) to your monitoring and will
increase the amount of time a host spends in a DOWN state. Nagios still
schedules and executes service checks for a DOWN host while suppressing
notifications, and every check that runs is another chance for Nagios
to discover a host state change.
Service dependencies are usually used for truly dependent services, e.g.
a db that serves content for a webserver. If the db goes down, don't
check the webserver, as the content will not be accessible by the webserver.
Of course you may still get a situation like the first example where the
checks are executed in such an order that will not catch the db failure
before the webserver is checked, and you get a notification about the
webserver failure before the db failure.
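A rough sketch of what such a definition could look like in v2.x (the
host and service names here are invented for illustration, they are not
from any real config):

    # Hypothetical names: hosts "dbhost" and "webhost",
    # services "DB Check" and "HTTP".
    # host_name/service_description = the service being depended ON
    # dependent_* = the service that gets suppressed
    define servicedependency{
        host_name                       dbhost
        service_description             DB Check
        dependent_host_name             webhost
        dependent_service_description   HTTP
        execution_failure_criteria      u,c
        notification_failure_criteria   u,c
        }

With this, the HTTP check is neither executed nor notified on while the
DB check is UNKNOWN or CRITICAL.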
If you have this dependency, you generate extra config for you or others
to manage. (This may be fine :))
I would not implement this dependency as RAM status doesn't really
depend on the db availability (really the other way around, the db
depends on the amount of RAM available, depending on your situation.)
> (The plugins on their own all work fine thru nagios as do they via the
> command-line. )
>
> I've defined below what I understand to be the way to configure
> dependencies but i'm not convinced its right even though
> it works fine. Could someone take a look and just sanity check this for
> me and let me know if i'm doing the right/wrong thing ?
<snip>
> # device1 dependency checks
> define servicedependency{
> host_name ngcp4
> service_description DB RAM checks for device1
> dependent_host_name ngcp4
> dependent_service_description device1 DB Check
> execution_failure_criteria u,c
> notification_failure_criteria u,c
> }
You've got it back-to-front; the above config makes the 'device1 DB
Check' service /depend on/ the 'DB RAM checks for device1' service.
The DB service check will not run, nor will notifications be sent out
for this service, when the RAM service check is UNKNOWN or CRITICAL.
Your first paragraph states that you want the RAM check to depend on the
DB check.
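If you do decide to keep the dependency, the definition would need the
two services swapped, something like this (untested, just your own
config with the service names reversed):

    # device1 dependency checks
    # RAM check depends on the DB check
    define servicedependency{
        host_name                       ngcp4
        service_description             device1 DB Check
        dependent_host_name             ngcp4
        dependent_service_description   DB RAM checks for device1
        execution_failure_criteria      u,c
        notification_failure_criteria   u,c
        }

Here host_name/service_description name the service that is depended
on, and the dependent_* lines name the service that is held back.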
We use dependencies on our mail server as there are many dependent
services (IMAP, POP, SMTP) that depend on many other components of the
mail server - this is the only situation where we use dependencies. (I
needed a large beer after figuring out that config...)
In conclusion, I would remove your dependency. But that's me :)
HTH,
--
Greg Vickers
Project Manager, IT Security
Information Technology Services
Queensland University of Technology
L12, 126 Margaret St, Brisbane
Phone: (07) 3864 9536
Mobile: 0410 434 734
Email: g.vickers at qut.edu.au
IT Security web site: http://www.its.qut.edu.au/itsecurity/
CRICOS No. 00213J