HA Nagios system
Andreas Ericsson
ae at op5.se
Wed May 26 06:14:57 CEST 2004
Phil Manuel wrote:
> We have a requirement that our monitoring system has to be 99.995% or greater
> availability. has anyone run Nagios in this kind of environment? If so what
> techniques have been used?
>
We do that, with fully redundant servers. The only real difference
between them is that the slave-server doesn't send notifications (both
perform active checks, so everything is always as up-to-date as it could
possibly be).
The two servers have 2 NICs, of course, and are interconnected with a
CAT5 crossover cable to keep notifications from going haywire if the
switch goes down.
For config synchronization we have a nifty nrpe check. It works in 3
steps, where a failed test at step 1 or 2 short-circuits the config update:
1. Has the MD5 checksum of the config dir changed since the last check?
2. Does nagios -v nagios.cfg return OK?
3. scp -r ... ; restart nagios
Step 2 above is quite necessary, or we might find ourselves copying a
configuration which is in the midst of being altered.
This technique is also happily being applied to plugins and so on.
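
For illustration, the three steps above could be sketched roughly like
this in Python. This is not our actual check, just a minimal sketch: the
function names, the state file and the return strings are made up for the
example, and the verify and push commands are passed in by the caller
rather than spelled out here.

```python
import hashlib
import os
import subprocess


def dir_md5(path):
    """MD5 over relative file names and file contents, in sorted order."""
    digest = hashlib.md5()
    for root, dirs, files in os.walk(path):
        dirs.sort()  # make the walk order deterministic
        for name in sorted(files):
            full = os.path.join(root, name)
            digest.update(os.path.relpath(full, path).encode())
            with open(full, "rb") as fh:
                digest.update(fh.read())
    return digest.hexdigest()


def sync_config(config_dir, state_file, verify_cmd, push_cmd):
    """Run the three steps; state_file must live outside config_dir."""
    current = dir_md5(config_dir)
    previous = None
    if os.path.exists(state_file):
        with open(state_file) as fh:
            previous = fh.read().strip()

    # Step 1: nothing changed since the last check, so do nothing.
    if current == previous:
        return "unchanged"

    # Step 2: the config does not verify (maybe it is mid-edit), so do
    # not copy it. The checksum is not saved, so the next run retries.
    if subprocess.call(verify_cmd) != 0:
        return "invalid"

    # Step 3: copy the config over and restart (caller-supplied command).
    if subprocess.call(push_cmd) != 0:
        return "push-failed"

    with open(state_file, "w") as fh:
        fh.write(current)
    return "synced"
```

Here verify_cmd would be something like ["nagios", "-v", "nagios.cfg"],
and push_cmd a (hypothetical) wrapper script doing the scp and the
restart. The checksum is only remembered after a successful push, so a
half-edited config gets looked at again on the next run.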
If Nagios (or one of the servers) goes down, the other one automatically
kicks in and turns master (enabling notifications if it wasn't master
earlier). The event handler for this also sets a passive service to
CRITICAL, so that notifications for the 'other server is down' event are
sent out.
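
A rough Python sketch of what such an event handler might write to the
Nagios command pipe is below. It is an assumption-laden example, not our
handler: the host name, service description and plugin output are
invented, and the exact form of ENABLE_NOTIFICATIONS (with or without an
argument) depends on the Nagios version, so check the external command
documentation for yours.

```python
import time


def external_command(name, *args, now=None):
    """Format one line in the Nagios external command file syntax."""
    ts = int(time.time() if now is None else now)
    fields = (name,) + tuple(str(a) for a in args)
    return "[%d] %s\n" % (ts, ";".join(fields))


def takeover_commands(peer_host, service, now=None):
    """Commands the surviving server issues when it turns master."""
    return [
        # Start sending notifications from this server.
        external_command("ENABLE_NOTIFICATIONS", now=now),
        # Flag the peer as down via a passive service (return code 2
        # is CRITICAL). Host and service names are hypothetical.
        external_command(
            "PROCESS_SERVICE_CHECK_RESULT",
            peer_host,
            service,
            2,
            "CRITICAL - peer Nagios server is down",
            now=now,
        ),
    ]


def send(commands, command_file):
    """Write the commands to the Nagios command pipe (path is site-specific)."""
    with open(command_file, "w") as pipe:
        pipe.writelines(commands)
```

Keeping the formatting separate from the write makes it easy to look at
the generated lines before pointing send() at the real command pipe.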
> I was trying to think of using a Linux cluster, that shared a SAN disk array.
> Am I going down the right lines ?
>
Not necessarily. You might want to think about what happens when the
link to the SAN array goes down (nagios cmd-pipe will fail, as will
logging and just about everything else).
> Phil.
>
--
Sourcerer / Andreas Ericsson
OP5 AB
+46 (0)733 709032
andreas.ericsson at op5.se