HA Nagios system

Andreas Ericsson ae at op5.se
Wed May 26 06:14:57 CEST 2004


Phil Manuel wrote:
> We have a requirement that our monitoring system has to be 99.995% or greater
> availability. Has anyone run Nagios in this kind of environment? If so, what
> techniques have been used?
> 

We do that, with fully redundant servers. The only real difference
between them is that the slave server doesn't send notifications (both
perform active checks, so both are always as up to date as they can
possibly be).
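
For a sense of scale, here is a rough calculation of the downtime budget
that 99.995% implies. It is only a sketch: the function name is made up
for illustration, and it assumes a plain 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: non-leap, 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # 99.995% leaves roughly 26 minutes of downtime per year.
    print(f"{allowed_downtime_minutes(99.995):.2f} minutes per year")
```

That budget has to cover every restart, config reload and hardware fault,
which is why a second server that is already running checks matters.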

The two servers have two NICs each, of course, and are directly
interconnected with a crossover CAT5 cable to keep notifications from
going haywire if the switch goes down.

For config synchronization we have a nifty NRPE check. It works in three
steps, and the first two can short-circuit the config update:
1. Is the MD5 checksum of the config dir the same as last time we checked?
   (If so, there is nothing to do.)
2. Does nagios -v nagios.cfg return OK? (If not, stop.)
3. scp -r ... ; restart nagios

Step 2 above is quite necessary, or we might find ourselves copying a 
configuration which is in the midst of being altered.
This technique is also happily being applied to plugins and so on.
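
The three steps above could be sketched roughly like this. This is not
our actual check, just an illustration: dir_md5, sync_if_changed and the
state file are hypothetical names, and the verify, copy and restart
commands are left for the caller to supply as argument lists.

```python
import hashlib
import subprocess
from pathlib import Path


def dir_md5(config_dir: str) -> str:
    """MD5 over the relative names and contents of all files, in sorted order."""
    digest = hashlib.md5()
    root = Path(config_dir)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()


def sync_if_changed(config_dir, state_file, verify_cmd, copy_cmd, restart_cmd):
    """Run the three steps; return 'unchanged', 'invalid' or 'synced'."""
    current = dir_md5(config_dir)
    state = Path(state_file)  # hypothetical: holds the last synced checksum

    # Step 1: same checksum as last time means nothing to do.
    if state.exists() and state.read_text().strip() == current:
        return "unchanged"

    # Step 2: never copy a config that fails verification
    # (it may be in the midst of being edited).
    if subprocess.run(verify_cmd).returncode != 0:
        return "invalid"

    # Step 3: copy the config across, restart, and remember the checksum.
    subprocess.run(copy_cmd, check=True)
    subprocess.run(restart_cmd, check=True)
    state.write_text(current)
    return "synced"
```

Here verify_cmd would be something like ["nagios", "-v", "nagios.cfg"].
The state file has to live outside the config dir, or writing it would
change the checksum.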

If Nagios (or one of the servers) goes down, the other one automatically
kicks in and turns master (enabling notifications if it wasn't master
earlier). The event handler for this also sets a passive service to
CRITICAL, so that notifications for the 'other server is down' event are
sent out.
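
As a rough idea of what such an event handler might write to the Nagios
command file, here is a minimal sketch. It is not our actual handler. It
uses the ENABLE_NOTIFICATIONS and PROCESS_SERVICE_CHECK_RESULT external
commands; the helper names, the host and service names and the plugin
output text are all made up for illustration.

```python
import time
from typing import List, Optional


def external_command(name: str, *args: str, now: Optional[int] = None) -> str:
    """Format one line in external-command syntax: [timestamp] NAME;arg;..."""
    timestamp = int(time.time()) if now is None else now
    return f"[{timestamp}] " + ";".join((name, *args)) + "\n"


def takeover_commands(peer_host: str, service: str,
                      now: Optional[int] = None) -> List[str]:
    """Commands the surviving server submits when it turns master."""
    return [
        # Start sending notifications from this server.
        external_command("ENABLE_NOTIFICATIONS", now=now),
        # Mark the passive 'peer is alive' service CRITICAL (return code 2).
        external_command("PROCESS_SERVICE_CHECK_RESULT", peer_host, service,
                         "2", "Peer Nagios server is down", now=now),
    ]


def submit(commands: List[str], command_file: str) -> None:
    """Write the commands to the command pipe given by the caller."""
    with open(command_file, "w") as pipe:
        pipe.writelines(commands)
```

The path passed to submit() is whatever command_file is set to in
nagios.cfg; it is deliberately not hard-coded here.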

> I was thinking of using a Linux cluster that shared a SAN disk array.
> Am I going down the right lines?
> 

Not necessarily. You might want to think about what happens when the
link to the SAN array goes down (the Nagios command pipe will fail, as
will logging and just about everything else).

> Phil.
> 

-- 
Sourcerer / Andreas Ericsson
OP5 AB
+46 (0)733 709032
andreas.ericsson at op5.se

