Nagios and Cacti
Max
perldork at webwizarddesign.com
Thu Apr 9 17:31:54 CEST 2009
Hi Chris, Daniel,
I write about a number of the configuration decisions we made in order
to achieve our current level of performance on my blog:
http://www.semintelligent.com/blog/?q=Nagios
Please note that several of the configuration steps we have taken go
against what the Nagios documentation recommends, so if you wish to do
anything similar, make sure you understand the Nagios documentation
and the risks of violating its recommendations.
We have done a lot of custom development to help make implementing
SNMP-based checks across a large number of hosts easier for us:
1) We develop agent-specific checks in Perl that run cleanly under ePN
(we currently use Net-SNMP and SysEdge, and are starting to do Cisco
monitoring). These groups of checks are associated with host groups
specific to each agent type (e.g. net_snmp_hosts).
2) We create a custom base template for each agent type. The
template has custom attributes that associate the SNMP version,
community string, etc. with the host template. We also use custom
attributes in each agent-specific check (e.g. CPU), so that all
thresholds are defined at the host level and we can provide default
thresholds. For example:
define host {
    name                        net_snmp_host
    hostgroups                  +net_snmp_hosts
    __snmp_version              2c
    __snmp_community            myreadonlycommunity
    __snmp_port                 161
    __snmp_storage_partitions   all
    __snmp_storage_warn         90
    __snmp_storage_crit         95
    __snmp_la_warn              15:10:5
    __snmp_la_crit              30:20:10
    __snmp_mem_warn             free,lt,8
    __snmp_mem_crit             free,lt,5
    __snmp_swap_warn            50
    __snmp_swap_crit            65
    __snmp_cpu_warn             wait,gt,20
    __snmp_cpu_crit             wait,gt,30
    ...
    register                    0
}
For custom communities we create separate templates, e.g.:
define host {
    name                southwest-region-host
    hostgroups          +southwest-hosts
    __snmp_community    southWestRegionCommunity
    register            0
}
So now our end users can easily tell Nagios to poll their hosts with
SNMP, and they can override our thresholds at the host level if they
want, without having to know a thing about programming:
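define host {
    # southwest-region-host comes before net_snmp_host: when two templates
    # set the same variable, Nagios uses the first one listed
    use                 generic-host, southwest-region-host, net_snmp_host
    # Override CPU default thresholds
    __snmp_cpu_warn     wait,gt,40
    ...
}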
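To illustrate how a check might consume these attributes, here is a
minimal Python sketch (our real checks are Perl; this is not that code).
It assumes a "metric,comparison,limit" reading of specs such as
wait,gt,20, and that the check receives the attribute values as strings,
presumably via custom macros such as $_HOST_SNMP_CPU_WARN$. The function
names, the dictionaries, and the "ge"/"le" comparisons are hypothetical.

```python
import operator

# "gt" and "lt" appear in the templates above; "ge" and "le" are assumed extras.
COMPARISONS = {
    "gt": operator.gt,
    "lt": operator.lt,
    "ge": operator.ge,
    "le": operator.le,
}


def parse_threshold(spec):
    """Split a spec such as 'wait,gt,20' into (metric, comparison, limit)."""
    metric, op_name, limit = (part.strip() for part in spec.split(","))
    return metric, COMPARISONS[op_name], float(limit)


def breached(spec, sample):
    """Return True when the sampled metric crosses the threshold."""
    metric, compare, limit = parse_threshold(spec)
    return compare(sample[metric], limit)


def check_state(attrs, sample, prefix="_snmp_cpu"):
    """Map a sample to a plugin exit code: 0 OK, 1 WARNING, 2 CRITICAL."""
    if breached(attrs[prefix + "_crit"], sample):
        return 2
    if breached(attrs[prefix + "_warn"], sample):
        return 1
    return 0


# Template defaults, then a host-level override of the warning threshold.
template_defaults = {"_snmp_cpu_warn": "wait,gt,20", "_snmp_cpu_crit": "wait,gt,30"}
host_overrides = {"_snmp_cpu_warn": "wait,gt,40"}
effective = {**template_defaults, **host_overrides}
```

With a sample of {"wait": 25.0}, check_state(template_defaults, ...) gives
WARNING, while check_state(effective, ...) gives OK because the host raised
its own warning threshold.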
3) We have developed, and hope to release sometime this year, a
Perl-based, ePN-friendly SNMP check script that handles counters and
gauges well and lets you check multiple SNMP OIDs at once. This has
been extremely useful for custom SNMP application agents. A service
definition ends up looking like this:
define service {
    use                    check_snmp_oids-base
    service_description    Custom App - 5 minute SNMP checks
    __snmp_oids_spec       -O 'TimeMin:g:1.3.6.1.4.1.1900.5.5.2.2.1.0' \
                           -O 'labelFor1stOid:g:1.3.6.1.4.1.9999.1.3.0' \
                           -O 'labelFor2ndOid:g:1.3.6.1.4.1.9999.1.4.0' \
                           -O 'labelFor3rdOid:g:1.3.6.1.4.1.9999.1.5.0'
    __snmp_oids_crit_spec  labelFor1stOid,lt,0
    hostgroup_name         custom-agent-group
    servicegroups          custom-service-group
}
In some cases we check 15-20 OIDs at once using this methodology.
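As a rough illustration of the spec format above, the following Python
sketch parses a string of -O 'label:type:oid' arguments and verifies that
a threshold refers to a label that was actually defined. It is not our
script: the function names are hypothetical, and reading the middle
field ("g") as the value type, presumably gauge, is an assumption.

```python
import shlex


def parse_oid_spec(spec):
    """Parse "-O 'label:type:oid' ..." into a list of (label, type, oid) tuples."""
    tokens = shlex.split(spec)
    if len(tokens) % 2 != 0:
        raise ValueError("each -O flag needs exactly one argument")
    entries = []
    for flag, value in zip(tokens[0::2], tokens[1::2]):
        if flag != "-O":
            raise ValueError(f"unexpected flag: {flag}")
        # Split on the first two colons only; a numeric OID contains none.
        label, kind, oid = value.split(":", 2)
        entries.append((label, kind, oid))
    return entries


def threshold_label_known(threshold_spec, entries):
    """Check that a 'label,op,value' threshold names a label from the OID spec."""
    label = threshold_spec.split(",", 1)[0]
    return label in {entry[0] for entry in entries}
```

A check like threshold_label_known is cheap insurance: labels are
compared as exact strings here, so a threshold whose label differs only
in case from the one in the OID spec would otherwise never match.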
Our script uses memcached to cache counter data to get delta output
properly and we have code that adjusts data properly for over samples,
under samples, and large deltas.
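The general idea of caching counters to compute deltas can be sketched
in Python as follows. This is a simplified illustration, not our code: a
plain dict stands in for memcached, the counter is assumed to be 32 bits
wide and to wrap at most once between polls, and implausibly large
deltas are simply discarded rather than adjusted. The function name and
the max_rate parameter are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # assumption: the OID is a 32-bit counter


def counter_rate(cache, key, value, now, max_rate=None):
    """Return the per-second rate since the previous sample, or None if unusable.

    cache    -- dict standing in for memcached: key -> (value, timestamp)
    value    -- current raw counter reading
    now      -- current time in seconds
    max_rate -- optional ceiling; faster rates are treated as bogus
    """
    previous = cache.get(key)
    cache[key] = (value, now)  # always remember the newest sample
    if previous is None:
        return None  # first sample: nothing to compare against yet

    prev_value, prev_time = previous
    elapsed = now - prev_time
    if elapsed <= 0:
        return None  # duplicate or out-of-order sample

    delta = value - prev_value
    if delta < 0:
        delta += COUNTER32_MODULUS  # assume a single counter wrap

    # Dividing by the real elapsed time normalizes early and late polls.
    rate = delta / elapsed
    if max_rate is not None and rate > max_rate:
        return None  # e.g. the agent restarted and the counter reset
    return rate
```

For example, with an empty cache a first call returns None, and a second
call 300 seconds later with a counter that grew by 300 returns 1.0.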
Many of our checks are based on code I wrote, which can be downloaded
here:
http://www.nagios3book.com/nagios-3-enm/checks/
We have significantly enhanced things since then, though.
So, a lot of development time up front but the end result is we get
terrific performance and a lot of flexibility. We are using Nagios to
replace $$$ COTS products, so our company is happy to have us spend
time doing custom development. I realize many of you do not have that
luxury, so I understand that this approach won't be ideal for everyone.
Sorry.
Development time with two people to get to where we are now - about 3-4 months.
We have permission to release a lot of the code we have done, just
need time to package it properly for a public release .. so hopefully
we can share some of our tools and help others do something similar
without the 3-4 months development time :p.
Hope this helps more than it confuses.
- Max