Advice on suitability of Nagios and NSCA for a blended centralised/distributed model

Chris Holt Chris.Holt at london2012.com
Wed Jun 2 10:17:33 CEST 2010


Hi, I'll start by admitting n00bness, but I have googled a lot about Nagios and I hope this does not duplicate a previous mail.  Be kind!

I am looking to create a monitoring model that does not cleanly fit into anything I have seen, and before I spend days getting it all to (not) work I wanted to validate my plans.  Basically, I have a lot of remote, temporary event sites popping up on ADSL or behind multiple NATs (i.e. outbound access to the Internet only) and disappearing after a few weeks.  On-site infrastructure will go up very quickly and come down equally quickly [1].

What I want to be able to do is have a layout so that:

-          I can access a view of the local site's devices from a local event server, and a view of all the sites from the central server

-          All the polling happens from the local site servers and the central server only pings the external ADSL IPs of each site to check if they are alive

-          All alerts are sent out from the central server via email, SMS etc

-          Most checks will just be pings or receipt of syslog messages/SNMP traps from local devices

I am assuming I will need to play with NSCA and have the local server doing active checks, exporting stats to a file, and the central server doing passive checks, with NSCA syncing the stats to the central server.  However, the ADSL polling stats for each site need to be synced back to the local servers, when those are reachable, so that the on-site view shows those stats too.

Some things I am not sure about though:

a) the NSCA assumption in the paragraph above
b) that this is possible with only outbound access to the Internet from the local server
c) how management of configured devices is kept in sync between the local and central servers
d) how the central server can have suppression/correlation of events so that a remote site being down shows just "site X down" instead of 100 alerts about each element of the site being down
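
To make points (a) and (b) concrete, here is a minimal sketch of what the local server would hand to send_nsca, assuming the stock NSCA client is used.  As far as I understand it, send_nsca reads tab-delimited lines on stdin and opens the connection itself towards the central NSCA daemon, which is why it should cope with outbound-only access.  The host and service names used with it are hypothetical, and the helper functions are my own, not part of Nagios or NSCA.

```python
import time

# Nagios plugin return codes for service checks.
# (For passive host results the codes are 0 = UP, 1 = DOWN, 2 = UNREACHABLE.)
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nsca_service_line(host, service, code, output):
    """One passive service result in send_nsca's tab-delimited input format."""
    return f"{host}\t{service}\t{code}\t{output}\n"


def nsca_host_line(host, code, output):
    """One passive host result: same format, but without the service field."""
    return f"{host}\t{code}\t{output}\n"


def external_command(host, service, code, output, now=None):
    """The same service result as an external command file entry.

    This is the form in which the central Nagios would process the result
    once the NSCA daemon has received it.
    """
    ts = int(time.time() if now is None else now)
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{output}\n"
```

The output of nsca_service_line() would be piped into something like "send_nsca -H <central server> -c <send_nsca.cfg>" on the local server.  The central server then needs matching host and service definitions that accept passive checks, otherwise the results are dropped.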

The key part of this is that the setup itself can be really complicated, but the day-to-day operation must be as simple as possible, i.e. I can't expect users to follow instructions like "log into Linux, use vi to change configs, kill -HUP the daemon and rsync to the manager to update it".
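
One way this could be kept simple, sketched under my own assumptions: users maintain a plain "name,address" list per site (e.g. exported from a spreadsheet) and a script renders the Nagios host definitions for both the local and the central server.  The sketch also sets the host "parents" directive to the site gateway, which, as I understand it, lets Nagios mark devices behind a dead gateway as UNREACHABLE rather than DOWN and so helps with the "site X down" suppression.  The CSV layout, the site/gateway naming scheme and the use of the sample "generic-host" template are all hypothetical.

```python
import csv
import io

# Doubled braces are literal braces in str.format().
HOST_TEMPLATE = """define host {{
    use        {template}
    host_name  {name}
    address    {address}
{parents}}}
"""


def render_hosts(inventory_csv, site, gateway, template="generic-host"):
    """Render Nagios host definitions for one site from 'name,address' rows.

    Every device except the gateway itself gets the site gateway as its
    parent, so the whole site hangs off a single host on the status map.
    """
    blocks = []
    for row in csv.DictReader(io.StringIO(inventory_csv)):
        short_name = row["name"].strip()
        if short_name == gateway:
            parents = ""  # the gateway is the top of the site's tree
        else:
            parents = f"    parents    {site}-{gateway}\n"
        blocks.append(HOST_TEMPLATE.format(
            template=template,
            name=f"{site}-{short_name}",
            address=row["address"].strip(),
            parents=parents,
        ))
    return "\n".join(blocks)


if __name__ == "__main__":
    inventory = "name,address\ngw,10.0.0.1\nswitch1,10.0.0.2\n"
    print(render_hosts(inventory, site="siteA", gateway="gw"))
```

The same generated file could be pushed to both servers so the device lists cannot drift apart; whether notifications for UNREACHABLE hosts are actually suppressed still depends on the notification options chosen in the host template.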

Is this just far too complicated an idea?

Thanks in advance

--Chris

[1] On a side note, advice on how best to get very non-technical users to be able to write monitoring configs would be appreciated.
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users