Continuing issues with retention file causing schedule/actions to be ignored.
Eli Stair
estair at ilm.com
Wed Mar 8 21:12:28 CET 2006
I'm continuing to have problems where the retention.dat file gets into a
state that stops the nagios process from functioning properly. In the
past, the symptom was an increasing number of hosts, or entire
hostgroups, no longer executing their service checks. Today, the event
handler for one particular service stopped being executed (while all
others continue to work).
In this and all previous cases, stopping nagios and moving the retention
file out of the way resolves the issue. A reload or a hard stop/start
of nagios alone has no effect. There has never appeared to be
anything "wrong" with the retention file.
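One way to look for retained state that would explain skipped checks or event handlers is to scan the file for services whose flags were saved as disabled. The following is only a minimal sketch: it assumes the block-style layout ("service {", key=value lines, "}") and the field names active_checks_enabled and event_handler_enabled, and the function names are hypothetical helpers, not part of nagios.

```python
def parse_retention(text):
    """Split retention text into (block_type, {key: value}) pairs.

    Assumes blocks of the form 'type {' ... '}' with one key=value per line.
    """
    blocks = []
    current_type, current = None, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if current is None:
            if line.endswith("{"):
                current_type = line[:-1].strip()
                current = {}
        elif line == "}":
            blocks.append((current_type, current))
            current_type, current = None, None
        elif "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return blocks


def disabled_services(blocks,
                      flags=("active_checks_enabled", "event_handler_enabled")):
    """Return (host, service, [flags retained as 0]) for each service block."""
    hits = []
    for block_type, fields in blocks:
        if block_type != "service":
            continue
        off = [f for f in flags if fields.get(f) == "0"]
        if off:
            hits.append((fields.get("host_name"),
                         fields.get("service_description"),
                         off))
    return hits
```

Running disabled_services(parse_retention(open(path).read())) against a copy of the file saved before moving it aside would list any services whose retained flags differ from what the object configuration intends. An empty result does not rule out a retention problem; it only rules out these two flags.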
The only issues with my installation are this one, the
all-too-frequent "premature end of script headers" errors in all the
CGIs, and "Warning: Size of service_message struct (528 bytes) is >
POSIX-guaranteed atomic write size (512 bytes)." due to compiling for
x86_64. That being said, these are serious: there are dozens of
daily "premature script header/Internal Server Error" failures wreaking
havoc with production, and these instances of event failures are
extremely critical. The script header problem appeared
immediately upon upgrading from 2.0b6 to 2.0rc2+, and the
scheduling/retention problem has been present to varying degrees in
every 2.0b+ I've tried.
I would be happy to find that these are configuration/optimization
issues on my end that I can resolve, but my suspicion is that they are
bugs. I will do anything I can to help provide a debug testbed for
identifying and tracking them down. Attached is my main nagios config
(objects are not included), and I can provide any other data (object
configs, logs, retention.dat, etc.) privately if needed, for security
reasons.
Please let me know what I can do to help address this and find a resolution.
Regards,
/eli