Nagios as a Service Resiliency Manager

gmartin gmartin at gmartin.org
Fri Dec 11 18:20:33 CET 2009


Chris, the great thing about Nagios is that it enables creative solutions
like this. I'd love to see you try it and report back on how it works for
you.
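
For what it's worth, here is a rough sketch of the kind of event handler
described below. It is an illustration under stated assumptions, not a tested
setup. The script, its function names, the latch file path, and the use of ssh
with an init script on Host B are all hypothetical. It assumes the Nagios
command definition passes $SERVICESTATE$ and $SERVICESTATETYPE$ (or
$HOSTSTATE$ and $HOSTSTATETYPE$ for a host event handler), followed by the
passive host and service names. Those two names could come from custom
service variables such as _PASSIVE_HOSTNAME and _PASSIVE_SERVICENAME, which
Nagios 3 exposes as $_SERVICEPASSIVE_HOSTNAME$ and
$_SERVICEPASSIVE_SERVICENAME$. The latch file makes failover a one-shot
action, so nothing is switched back automatically.

```python
import os
import subprocess

# States treated as a failure: CRITICAL for a service event handler,
# DOWN for a host event handler attached to Host A.
FAILED_STATES = {"CRITICAL", "DOWN"}


def should_fail_over(state, state_type, latch_exists):
    """Return True only for a confirmed (HARD) failure with no earlier failover."""
    return state in FAILED_STATES and state_type == "HARD" and not latch_exists


def build_start_command(passive_host, passive_service):
    """Build the remote start command (ssh and the init script path are assumptions)."""
    return ["ssh", passive_host, "/etc/init.d/" + passive_service, "start"]


def handle_event(state, state_type, passive_host, passive_service,
                 latch_path, runner=subprocess.call):
    """Start the passive service once; return the runner's exit code, or 0 if idle."""
    if not should_fail_over(state, state_type, os.path.exists(latch_path)):
        return 0
    # Write the latch before starting anything, so a repeated handler run
    # cannot trigger a second failover. An operator removes the file by hand
    # after confirming which instance should stay active.
    with open(latch_path, "w") as latch:
        latch.write("failover to %s/%s\n" % (passive_host, passive_service))
    return runner(build_start_command(passive_host, passive_service))
```

A small wrapper would call handle_event() with the values from sys.argv[1:5]
and a fixed latch path. Acting only on HARD states avoids failing over on a
single missed check. Note that this does nothing to stop SERVICE_1 if Host A
comes back, so it does not solve the Active / Active overlap on its own.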

On 12/11/09, Christopher McAtackney <cristoir at gmail.com> wrote:
> That's an interesting link - but unfortunately I don't think it really
> covers the situation where a host goes down or becomes unreachable. It
> may be the case that Nagios is not suitable for this purpose, but I
> thought I would check on here in case anyone had done anything like
> this previously.
>
> Cheers,
> Chris
>
> 2009/12/10 Marcel <mitsuto at gmail.com>:
>> Maybe this would help:
>> http://onlamp.com/onlamp/2006/05/25/self-healing-networks.html
>>
>> On Thu, Dec 10, 2009 at 3:08 PM, Christopher McAtackney
>> <cristoir at gmail.com>
>> wrote:
>>>
>>> Hi all,
>>>
>>> I have a need to control an Active / Passive pair of components and
>>> was wondering if anyone had tackled this problem with Nagios?
>>>
>>> The scenario is as follows:
>>>
>>> Host A has SERVICE_1 installed and running. Host B has SERVICE_2
>>> installed, but not running.
>>>
>>> The desired functionality is to detect when SERVICE_1 is not running
>>> (or that Host A is down / unreachable), and then to start SERVICE_2 on
>>> Host B.
>>>
>>> I believe I can do this with Nagios by defining an event handler on
>>> SERVICE_1 which will make the appropriate call to start SERVICE_2 on
>>> Host B.
>>>
>>> Would it make sense to store the relationship between SERVICE_1 and
>>> Host B / SERVICE_2 as custom service macros, e.g.
>>> $_SERVICEPASSIVE_HOSTNAME$ and $_SERVICEPASSIVE_SERVICENAME$?
>>>
>>> I believe there are too many scenarios in which SERVICE_1 might come
>>> back up to try to automate switching off SERVICE_2. For example,
>>> someone accidentally pulls a network cable on Host A, then plugs it
>>> back in 15 minutes later. During that time Nagios detects that Host A
>>> is down and starts up SERVICE_2. Once the cable is plugged back in, we
>>> have two Active instances running, which is what we specifically
>>> wanted to avoid. Even if Nagios detects that the primary component is
>>> up, it's still too late, because any Active / Active overlap will
>>> cause problems for this particular application.
>>>
>>> I can't think of any way to automate that side of things - but does
>>> the general concept of having Nagios start up a Passive partner make
>>> sense?
>>>
>>> Thanks for any insight you have,
>>>
>>> Chris
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Return on Information:
>>> Google Enterprise Search pays you back
>>> Get the facts.
>>> http://p.sf.net/sfu/google-dev2dev
>>> _______________________________________________
>>> Nagios-users mailing list
>>> Nagios-users at lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/nagios-users
>>> ::: Please include Nagios version, plugin version (-v) and OS when
>>> reporting any issue.
>>> ::: Messages without supporting info will risk being sent to /dev/null
>>
>>
>
>

-- 
Sent from my mobile device

\\Greg
