Nagios dependency question
John Maddalozzo
john at journyx.com
Thu Dec 26 19:42:45 CET 2002
On Thu, 26 Dec 2002, Scott Whitney wrote:
> Background:
> a) Nagios runs "here"
> b) There is a router "here"
> c) It goes across the Internet to my coloc site (call it "there")
> d) There is a router "there"
> e) For the purposes of this example, there is 1 "machine" there
> f) "machine" runs httpd
> g) this httpd is shared for all web apps on the box, of which there are 55
> h) I have a script which checks the status of this web app.
>
> Here's my problem. When the router, here, is down, I get 59 messages. That
> is, router "here", router "there", machine ping, machine httpd + 55 sites.
No, actually you don't get the "site down" messages, but you do (or
did) get the "machine down" messages. At least that is what was happening
the other day.
I already put in an additional dependency (I think it is on the
coloc-router) for the machine notifications. I think this is covered now,
but I haven't tested it. We should be able to test it by unplugging the T1.
Note there are two types of dependencies: host and service. I added the
service dependency. The host dependency was already there.
I don't understand your math below.
>
> I can solve this using dependencies, but here's my question.
>
> For the dependencies to work properly, each of the sites must be dependent
> on:
> a) httpd
> b) ping machine
> c) ping router "there"
> d) ping router "here"
>
> Let's assume I check this every minute. My math says that this is roughly
> 280 hits on httpd per minute (55 * 5 + 5), 280 pings to the machine per
Huh? What are the *5 and the +5?
> minute, 280 pings to the router there per minute and 280 pings to the router
> here per minute.
>
> This gets a little worse when you realize I actually have over 200 sites,
> not 55. Also on 7 boxes, not one, so we're looking at more like 1005 per
> minute, spread unevenly across several boxes.
>
> The question, then, is whether anyone has run into this and/or does Nagios
> take this into consideration via any caching mechanism? The documentation
> says
>
> "Before Nagios executes a service check or sends notifications out for a
> service, it will check to see if the service has any dependencies. If it
> doesn't have any dependencies, the check is executed or the notification is
> sent out as it normally would be. If the service does have one or more
> dependencies, Nagios will check each dependency entry as follows:
> Nagios gets the current status* of the service that is being depended upon.
> "
>
> * by default this is the current HARD state
I presume either out of the state file, or its image in shared memory.
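If that presumption holds, a dependency test is only a lookup of the last
recorded hard state, so it adds no extra pings or httpd hits. Here is a
minimal Python sketch of the idea. It is not Nagios code: the names
(hard_state, depends_on, run_plugin, check_cycle) and the host/service labels
are made up for illustration, it assumes the state lives in an in-memory
table, and it ignores retries and soft states.

```python
# Simplified model, NOT Nagios source code. All names here are hypothetical.
OK, CRITICAL = "OK", "CRITICAL"

hard_state = {}   # last recorded hard state per service (assumed in memory)
executions = {}   # how many times each check plugin was actually run

# One box from the example: two routers, ping, httpd and 55 site checks.
upstream = ["router-here", "router-there", "machine-ping", "machine-httpd"]
sites = [f"site-{n:02d}" for n in range(1, 56)]

# Every service depends explicitly on everything upstream of it, because
# this model does not follow dependencies transitively.
depends_on = {name: upstream[:i] for i, name in enumerate(upstream)}
depends_on.update({site: upstream for site in sites})


def run_plugin(service, down):
    """Pretend to run the check plugin and record that it ran."""
    executions[service] = executions.get(service, 0) + 1
    return CRITICAL if service in down else OK


def check_cycle(down):
    """Run one scheduling pass; return the services that would notify."""
    notified = []
    for service in upstream + sites:
        # The dependency test is a table lookup, not a fresh check.
        blocked = any(hard_state.get(dep) == CRITICAL
                      for dep in depends_on[service])
        if blocked:
            continue  # neither executed nor notified on this pass
        hard_state[service] = run_plugin(service, down)
        if hard_state[service] == CRITICAL:
            notified.append(service)
    return notified
```

In this model a healthy pass runs each of the 59 checks exactly once, and a
pass with the local router down runs one check and produces one notification
instead of 59. Whether real Nagios behaves this way in every detail is what
needs confirming.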
>
> So...from where is it getting this information? Further perusal through the
> theory section helps me not at all...
>
> Anyone have ideas on this?
>
> Thanks,
>
> Scott Whitney
> swhitney at Journyx.com
>
--
__________________________________________________________________
Web-Based Project Management and TimeSheet Software
Journyx Timesheet http://www.journyx.com
John Maddalozzo, V.P. Engineering - john at journyx.com (512)833-3274
------------------------------------------------------------------