Really slow log2ndo
Andreas Ericsson
ae at op5.se
Thu Sep 18 12:20:20 CEST 2008
Benjamin Krein wrote:
> On Sep 18, 2008, at 4:04 AM, Andreas Ericsson wrote:
>
>> Mikael Fridh wrote:
>>> On Wed, Sep 17, 2008 at 7:46 PM, Andreas Ericsson <ae at op5.se> wrote:
>>>> Benjamin Krein wrote:
>>>>> Is there a reason log entries aren't split up into multiple fields in
>>>>> the NDO DB? It seems kind of silly to put the entire log line in a
>>>>> single field when there are very clear delimiters in the line.
>>> These are a few excerpts of the logentries;
>>>
>>> Auto-save of retention data completed successfully.
>> Junk.
>>
>>> LOG ROTATION: DAILY
>> Implementation detail junk.
>>
>>> CURRENT HOST STATE: bernicla2;UP;HARD;1;PING OK - Packet loss = 0%,
>>> RTA = 0.50 ms
>> Useful, but superfluous (especially with LOG ROTATION: DAILY).
>
> I don't understand this response. I see some important stats in there
> that I'd want to put on a graph. Not sure why you consider it
> superfluous.
>
Because it's CURRENT HOST STATE, which just re-iterates the latest
HOST ALERT, but does it on every log rotation. In other words, it's
superfluous for data analysis.
>>> ndomod: Successfully reconnected to data sink! 0 items lost, 88
>>> queued items to flush.
>> NEB junk. Will go to separate logfile in a not too far off future.
>>
>>> CURRENT SERVICE STATE: rt-130-238-128-160-27;PING;OK;HARD;1;PING OK -
>>> Packet loss = 0%, RTA = 2.77 ms
>>>
>> See comment for CURRENT HOST STATE entry.
>>
>>> There are only clear delimiters for some types of entries, not all.
>>> And if you do separate the types into different tables it will stop
>>> being a Log... A log is where you go to check historically what
>>> happened during a certain period of time. If you split it up in
>>> separate tables it will be more complex to get an excerpt on just
>>> that - all events during a certain period.
>
> Well, each of the various log types has clear delimiters. Doesn't
> seem like the logic involved in deciding which items belong in which
> field based on the log type would be that difficult.
>
You're contradicting yourself. No, it's not hard to determine where
each logentry should go. Nagios does it automatically when issuing
its callbacks.
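
To illustrate the delimiter point, here is a minimal Python sketch that splits the two semicolon-delimited entry types quoted above into named fields. The field names and the `parse_log_entry` helper are assumptions inferred from the excerpts in this thread; this is not how ndoutils or log2ndo handles them. It parses the lines as quoted here, so any leading timestamp would need to be stripped first.

```python
# Field layouts inferred from the excerpts quoted in this thread (assumption,
# not taken from the ndoutils source).
FIELDS = {
    "CURRENT HOST STATE": (
        "host_name", "state", "state_type", "attempt", "output",
    ),
    "CURRENT SERVICE STATE": (
        "host_name", "service_description", "state", "state_type",
        "attempt", "output",
    ),
}


def parse_log_entry(line):
    """Split 'TYPE: a;b;c' into a dict, or return None for unknown layouts."""
    entry_type, sep, payload = line.partition(": ")
    names = FIELDS.get(entry_type)
    if not sep or names is None:
        return None
    # Plugin output may itself contain ';', so cap the number of splits.
    values = payload.split(";", len(names) - 1)
    if len(values) != len(names):
        return None
    record = dict(zip(names, values))
    record["entry_type"] = entry_type
    return record


host = parse_log_entry(
    "CURRENT HOST STATE: bernicla2;UP;HARD;1;"
    "PING OK - Packet loss = 0%, RTA = 0.50 ms"
)
junk = parse_log_entry("Auto-save of retention data completed successfully.")
```

Entries without a known layout (the "junk" lines above) come back as `None`, so a caller can skip them or keep them as raw text.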
>>>
>>>> I imagine the current structure was designed to facilitate displaying
>>>> the entire log though. Your guess is as good as mine, as it was a long
>>>> time ago I took a look at the ndoutils code.
>>> It's a log on disk, thus it's a log in the database. At least that's my
>>> take on it.
>>> I would use MyISAM MERGE tables or PARTITIONS for this if you want to
>>> keep an "online" archive of the logs.
>>> With the MERGE trick you can compress old tables and rejoin them in
>>> the merge periodically if you'd like.
>>>
>> Or one can just print the logfiles from disk. They're already partitioned
>> into rotated logfiles so it's not that much of a chore.
>>
>>>>> I'm contemplating writing my own parser for the archived logs, but I'm
>>>>> tempted to modify the NDO code to make use of multiple fields.
>>> You could modify the ndo code to recognize more event types. There are
>>> a few logentry_type IDs that are duplicated, it seems.
>
> My knowledge of C is pretty weak. I'll see what I can come up with.
>
Don't bother with the LOG_DATA type. Just use the other nebcallbacks
and insert their data into various tables. That'll be a whole lot
better actually.
>>>
>>> The event history of objects (services and hosts) is already in
>>> nagios_statehistory, so what else is there in the log that you could
>>> gain so much from parsing out into separate tables/fields?
>>>
>> Notifications. Program start and stop. Downtime start and stop.
>> Adding a *raw* log to the database just duplicates the log that's
>> already saved on disk, so it buys us absolutely nothing while not
>> taking advantage of the indexing that a database can offer. And
>> even if the logfiles would ever go away (not likely), it's totally
>> trivial to concatenate logentries from several tables when one
>> wants to view them.
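
As a rough sketch of the "concatenate logentries from several tables" idea, the following Python example uses an in-memory SQLite database with two hypothetical tables, `notifications` and `downtimes`. The table names, column names and sample rows are invented for illustration and are not the NDOutils schema. A `UNION ALL` ordered by timestamp rebuilds a chronological log view, while each table stays indexable by host.

```python
import sqlite3

# Hypothetical per-event tables; NOT the NDOutils schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE notifications (ts INTEGER, host_name TEXT, message TEXT);
    CREATE TABLE downtimes (ts INTEGER, host_name TEXT, message TEXT);
    CREATE INDEX notifications_host ON notifications (host_name, ts);
    CREATE INDEX downtimes_host ON downtimes (host_name, ts);
""")
# Invented sample rows, for illustration only.
conn.execute(
    "INSERT INTO notifications VALUES (1221733200, 'bernicla2', 'notification sent')"
)
conn.execute(
    "INSERT INTO downtimes VALUES (1221733100, 'bernicla2', 'downtime started')"
)
conn.execute(
    "INSERT INTO downtimes VALUES (1221733300, 'otherhost', 'downtime started')"
)


def combined_log(conn, start, end):
    """Return all events between start and end (inclusive), oldest first."""
    return conn.execute(
        """
        SELECT ts, 'NOTIFICATION' AS kind, host_name, message
          FROM notifications WHERE ts BETWEEN ? AND ?
        UNION ALL
        SELECT ts, 'DOWNTIME' AS kind, host_name, message
          FROM downtimes WHERE ts BETWEEN ? AND ?
        ORDER BY ts
        """,
        (start, end, start, end),
    ).fetchall()


rows = combined_log(conn, 1221733000, 1221733250)
```

The same tables can then answer per-host questions directly, for example `SELECT * FROM downtimes WHERE host_name = ?`, which a single raw-text log column cannot do without string matching.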
>
> This is exactly my point. What's the point of using a DB if you're
> just dumping a log entry into it? The timestamps make sense for
> querying for specific log entries, but as Andreas said, they already
> live in logically rotated files on disk that are easy to grep. Since
> they're already delimited, it would really be more beneficial to be
> able to query on some of the other fields as well (i.e., host names,
> service descriptions, states, etc.)
>
> My ultimate goal for using these log entries in the DB is to compile
> reports based on various metrics that are gathered by Nagios already.
> The basic trend reports in Nagios & things like NagiosGrapher are ok,
> but they aren't very flexible. Pulling that data from the DB would be
> far easier, but not in the format it's in now. As it is now, it's no
> different than just grepping files on disk.
>
http://www.op5.org/git/nagios/reports-module.git
http://www.op5.org/git/nagios/reports-gui.git
The code is already there. It's already open source, and it's already
being used in production by our 300+ customers.
If you want area graphs from it (as opposed to better, prettier and
more accurate availability reports) I suggest you take a look at the
Netways grapher. I believe you'll find it at http://www.nagiosforge.org
That uses NDOutils data-format though, so you won't get away from that
log2ndo stuff.
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231