Nagios 2.0 Event Broker and DB Support
Ethan Galstad
nagios at nagios.org
Fri Aug 1 07:02:03 CEST 2003
Sorry for the crosspost, but the nagios-devel list is usually pretty
quiet when I request comments about new features I'm implementing.
This one is bigger than most, so I wanted to reach more people. This
is a bit long, so bear with me...
I am almost done with the coding for 2.0. Two big things remain: the
event broker and DB support (which is currently broken).
My original intent was to develop the event broker as a separate
application, tying it to Nagios with a unix domain socket. Nagios
would send the event broker information about everything that was
going on (service checks, downtime, flapping, log entries, etc.).
The event broker would be able to load user-developed modules (object
files) at runtime and pass various types of Nagios data to them for
processing. This is all well and good: I have a working prototype
of the event broker that does just this and seems to work okay. But I
got to thinking that it was rather stupid to develop a separate
application for this when I could simply have Nagios load
user-developed modules itself. Doing this would give the modules the
benefit of having access to internal Nagios structures and functions
(which is good and bad - see below).
Here's an overview of how it would work:
- Nagios would load user-specified modules (object files) at startup
using the dlopen() function.
- Nagios would call the module's initialization function (the name of
which would be standardized).
- The module's init function would register for various types of
Nagios event data (service checks, host checks, log entries, event
handlers, etc.) using callback functions.
- When Nagios encounters an event for which a module has registered a
callback function, Nagios would call that module's function and pass
it data relevant to the event. The module is then free to do
whatever it wants to that event data. An example might be to log
service checks, performance data and log entries to MySQL, etc.
- Before shutting down, Nagios would call the module's de-init function.
This allows the module to clean up any resources it may be using.
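The flow in the list above could be sketched roughly as follows. This is only a minimal illustration of the register/dispatch part: every name in it (event_type, register_callback(), broker_dispatch(), mymod_init(), mymod_deinit()) is made up for the example and is not an existing Nagios interface. The callbacks take a void pointer rather than a Nagios struct, and the dlopen() loading step is left out so the sketch stays self-contained.

```c
#include <stddef.h>

/* Hypothetical event types; these names are illustrative, not Nagios API. */
typedef enum {
    EVENT_SERVICE_CHECK,
    EVENT_HOST_CHECK,
    EVENT_LOG_ENTRY,
    EVENT_TYPE_COUNT
} event_type;

/* A callback receives an opaque pointer to the data for one event. */
typedef int (*event_callback)(void *event_data);

#define MAX_CALLBACKS_PER_EVENT 8

/* One fixed-size callback list per event type (zero-initialized). */
static event_callback callbacks[EVENT_TYPE_COUNT][MAX_CALLBACKS_PER_EVENT];
static size_t callback_count[EVENT_TYPE_COUNT];

/* Called from a module's init function. Returns 0 on success, -1 on error. */
int register_callback(event_type type, event_callback cb)
{
    if ((int)type < 0 || type >= EVENT_TYPE_COUNT || cb == NULL)
        return -1;
    if (callback_count[type] >= MAX_CALLBACKS_PER_EVENT)
        return -1;
    callbacks[type][callback_count[type]++] = cb;
    return 0;
}

/* Called by the core when an event occurs.
 * Returns the number of callbacks that were invoked. */
int broker_dispatch(event_type type, void *event_data)
{
    size_t i;

    if ((int)type < 0 || type >= EVENT_TYPE_COUNT)
        return 0;
    for (i = 0; i < callback_count[type]; i++)
        callbacks[type][i](event_data);
    return (int)callback_count[type];
}

/* Example module (hypothetical): counts the service checks it sees. */
int mymod_checks_seen = 0;

static int mymod_handle_servicecheck(void *event_data)
{
    (void)event_data;   /* a real module would log this somewhere */
    mymod_checks_seen++;
    return 0;
}

/* Standardized init entry point: register for the events of interest. */
int mymod_init(void)
{
    return register_callback(EVENT_SERVICE_CHECK, mymod_handle_servicecheck);
}

/* Standardized de-init entry point: release whatever the module holds. */
int mymod_deinit(void)
{
    mymod_checks_seen = 0;
    return 0;
}
```

In this sketch the core would call mymod_init() once after loading the module, broker_dispatch(EVENT_SERVICE_CHECK, data) each time a service check completes, and mymod_deinit() before shutting down.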
Seems good in theory. Heck, might even work okay. However, there's
a big problem I have with it. If I implement things this way, the
user-developed modules would have access to internal Nagios data
structures and functions. This is not necessarily bad, as ill-
behaved modules would not be adopted by too many people. :-)
However, modules that might be compiled and working fine
for Nagios 2.0 might segfault under future versions if the internal
data structures change. Here's an example of what I mean:
User module registers for Nagios service check data using its
mymod_handle_servicecheck() function, which has a prototype of:
    int mymod_handle_servicecheck(service *);
The service struct is an internal Nagios structure definition which
changes between Nagios versions. If the user module is compiled for
use with Nagios 2.0 and its definition of the service struct, it
will have problems if it is not recompiled for future versions of
Nagios.
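To illustrate why a stale struct definition is dangerous, here is a small sketch. Both struct layouts are invented for the example (they are not the real Nagios service struct); the point is only that inserting one member shifts the offset of every member after it, so a module compiled against the old layout would read the wrong bytes.

```c
#include <stddef.h>

/* Hypothetical layout that the module was compiled against. */
typedef struct {
    char *host_name;
    char *description;
    int   current_state;
} service_old;

/* Hypothetical later layout: one member was inserted in the middle. */
typedef struct {
    char *host_name;
    char *description;
    char *display_name;   /* new member, assumed for the example */
    int   current_state;
} service_new;

/* Byte offset at which an old module expects to find current_state. */
size_t state_offset_old(void)
{
    return offsetof(service_old, current_state);
}

/* Byte offset at which the newer core actually stores current_state. */
size_t state_offset_new(void)
{
    return offsetof(service_new, current_state);
}
```

With these two definitions the offsets differ by the size of one pointer, so an unrecompiled module reading current_state would in fact be reading part of display_name.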
Off the top of my head, I could overcome this by requiring that the
user modules indicate (by calling a function) what version of Nagios
they are compiled for. If they report anything but the current
version (or do not report at all), unload them so they can do no
harm.
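A minimal sketch of that version check might look like the following. The names (BROKER_API_VERSION, module_info, module_report_version(), module_check_version()) and the use of a plain integer version are assumptions made for the example; a real implementation would also dlclose() the module handle where this sketch just clears a flag.

```c
#include <stddef.h>

/* Version the core was built with (hypothetical numbering scheme). */
#define BROKER_API_VERSION 1

typedef struct {
    const char *name;
    int reported_version;   /* 0 means the module never reported one */
    int loaded;             /* 1 while the module is considered active */
} module_info;

/* Called by the module to state which version it was compiled for. */
void module_report_version(module_info *mod, int version)
{
    mod->reported_version = version;
}

/* Called by the core after the module's init function returns.
 * Returns 0 if the module may stay loaded, -1 if it was unloaded. */
int module_check_version(module_info *mod)
{
    if (mod->reported_version != BROKER_API_VERSION) {
        /* Mismatch or no report at all: unload so it can do no harm. */
        mod->loaded = 0;
        return -1;
    }
    return 0;
}
```

Note that this only detects a mismatch; it forces a recompile for every release rather than keeping old modules working.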
I'm afraid I'm in a bit over my head on how to handle this one. Some of
you developers out there must have experience with this type of
thing. If so, how did you handle it? What would you recommend?
Comments, suggestions, flames? Is there a better way to accomplish
this? Speak up now.
What does this have to do with DB support, you ask? Well, if I
implement the event broker as I have proposed, I will be yanking
native DB support out of Nagios completely. You can then write a
module to log to a DB if you want. :-)
PS: I had originally planned on exposing almost all of Nagios' data
and events to the broker, but I may have to scale that down if I plan
on getting 2.0 out this century. Perhaps just support for:
- Service and host checks
- Event handlers
- Log data
This would allow the development of modules to log check information,
performance data, and log file data to a DB (or whatever).
Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org