Wish: Multiple instances of alerts on the same service/host
Marlo Bell
BELLMR at ldschurch.org
Mon Mar 19 16:43:39 CET 2007
I didn't quite understand the disk example. But a simple and popular answer to the trap issue is a program called SNMPTT.
Have SNMPTT catch all the traps you send and log them to a MySQL database. Then have an "SNMP Traps" service actively check the database for all new traps for the associated host.
I modified SNMPTT's DB just slightly, adding an "acknowledge" column, and then wrote a simple PHP page that lets a user acknowledge one or many traps for that particular host.
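To make the idea concrete, here is a minimal sketch of what such an active check might look like. It is not the attached check_traps.pl. It is written in Python and uses an in-memory SQLite database as a stand-in for MySQL so that it runs on its own. The table name "snmptt" and the columns "hostname", "formatline" and "acknowledge" are assumptions modelled loosely on the setup described above, and check_traps and demo_db are hypothetical helper names.

```python
import sqlite3

# Standard Nagios plugin exit codes
OK, CRITICAL = 0, 2


def check_traps(conn, hostname):
    """Return (exit_code, message) for the unacknowledged traps of one host."""
    rows = conn.execute(
        "SELECT formatline FROM snmptt "
        "WHERE hostname = ? AND acknowledge = 0 ORDER BY id",
        (hostname,),
    ).fetchall()
    if not rows:
        return OK, "TRAPS OK - no unacknowledged traps"
    # Every pending trap is listed, so a newer trap does not hide an older one.
    summary = "; ".join(row[0] for row in rows)
    return CRITICAL, f"TRAPS CRITICAL - {len(rows)} unacknowledged: {summary}"


def demo_db():
    """Build a throwaway database (assumed schema, illustrative data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snmptt ("
        "id INTEGER PRIMARY KEY, hostname TEXT, "
        "formatline TEXT, acknowledge INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO snmptt (hostname, formatline, acknowledge) VALUES (?, ?, ?)",
        [
            ("server1", "Fan 2 not OK", 0),
            ("server1", "battery needs replacement", 0),
            ("server2", "Fan 1 not OK", 1),  # already acknowledged
        ],
    )
    return conn
```

A real plugin would print the message and exit with the returned code. Because the service stays critical until every row for the host is acknowledged, both the fan trap and the battery trap remain visible in the plugin output.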
I'll leave SNMPTT for you to set up, but I'll include my simple plugin and PHP page. Feel free to modify them to meet your needs and your version of SNMPTT. I recommend using the PHP page as the "action_url" for the trap service, passing in the $HOSTNAME$ macro as a GET parameter. These are both pretty quick and dirty, so don't be too critical.
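The acknowledge step itself can be sketched in the same spirit. This is an illustration in Python rather than the PHP page mentioned above, and it assumes the same hypothetical "snmptt" table with "id", "hostname" and "acknowledge" columns; acknowledge_traps is an invented name.

```python
import sqlite3


def acknowledge_traps(conn, hostname, trap_ids):
    """Mark the given trap ids as acknowledged for one host.

    Returns the number of rows that were actually updated.
    """
    if not trap_ids:
        return 0
    # One placeholder per id keeps the query parameterised.
    placeholders = ",".join("?" for _ in trap_ids)
    cur = conn.execute(
        "UPDATE snmptt SET acknowledge = 1 "
        f"WHERE hostname = ? AND acknowledge = 0 AND id IN ({placeholders})",
        [hostname, *trap_ids],
    )
    conn.commit()
    return cur.rowcount
```

Restricting the update to the host name that arrives through the GET parameter means a page opened for one host cannot acknowledge another host's traps by accident.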
Good luck.
Marlo
>>> Ståle Askerød Johansen <s.a.johansen at usit.uio.no> 3/19/2007 5:51 AM >>>
(This may appear twice. I fumbled with my subscription confirmation)
Here at the University of Oslo we are currently running Nagios
alongside our current monitoring system in order to check if
Nagios suits our needs.
So far, we are very happy with most of what we see. However, we
also consider using Nagios (with some suitable www-interface) as
our primary alarm console. This means that we will want to feed lots
of passive checks into Nagios from several other systems.
Let me give you an example:
- we want to forward SNMP-traps to Nagios from the management cards of
our Dell and HP servers.
- we set up our trap-receivers to submit this through NSCA.
- on the nagios server, we define the service "snmp trap" on all the
relevant hosts. the service is volatile and not active.
- we test.
- the hardware sends for instance "Fan 2 not OK". Nagios receives this
as a critical event. let's pretend the operator uses some time to fix this.
- in the mean time, the hardware on the same host sends for instance
"battery needs replacement". Nagios receives this as a critical event,
but the previous event is NO LONGER visible in the interface.
Some may argue that we need to make separate services for each type of
trap we want to receive, but sheer numbers make this not very elegant.
We need a way to tell Nagios that "this service is of a special kind
whose events should not replace each other as they are received". This
will make it easier to use Nagios and a suitable web-gui as a central
alarm receiver without adding thousands of new services.
The same problem also makes it difficult to make, for instance, a plugin
that monitors all userdisks on a host and reports to a service
"userdisks", since the events will overwrite each other.
Has anyone else thought of this? Is it difficult to implement? Are we
wrong in assuming that this is impossible with the present Nagios? Have
we misunderstood completely? Is it a stupid and childish idea? :-)
--
Ståle Johansen, sysadmin, University of Oslo, Norway.
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel
----------------------------------------------------------------------
NOTICE: This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: check_traps.pl
Type: application/octet-stream
Size: 1307 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20070319/839df3b2/attachment.obj>