A question...cascading failures and failure to recover
Steven Schwartz
sschwartz at gracenote.com
Sat Feb 17 00:25:03 CET 2007
>On Fri, 16 Feb 2007, Steven Schwartz wrote:
>
>> I've noticed an odd circumstance on two of my four nagios servers
>> lately, and searching has found me no answers. Has anyone experienced
>> symptoms similar to these:
>>
>> 1) On a given server, a plugin produces a "critical failure" on many
>> (sometimes all) of the systems using that particular plugin.
>>
>> 2) Tests by hand of said plugin produce an "OK" result.
>>
>> 3) The system does not acknowledge the service having recovered until
>> checks are rescheduled by force, and then execute OK.
>>
>> Does this ring bells with anyone?
>There are a lot of circumstances that could cause something like this,
>from a bad plugin, to issues with embedded perl, to network issues, to
>incorrect file permissions or environment.
Ah. Hm. Embedded perl is definitely worth a look, as one of them (upon
rechecking, the problem's happened with two plugins) is a perl script,
though the other is simple, ugly Bourne shell.
As to a bad plugin, I wish I could ask for help there, but it's a
homebrew plugin that monitors proprietary code, so there's little
outside help to be had.
>There's just not nearly enough info here to have an idea where to start
>looking, though.
Anything in particular I could help provide, then?
-rwx------ 1 nagios nagios 401 Jan 11 15:46 s3test
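
Since environment and permissions were among the suggested causes, one way to narrow the gap between a hand test and a scheduled check might be to run the plugin with a nearly empty environment and compare the result. The following is only a rough sketch in Python, not anything Nagios itself provides: the `run_plugin` helper, the `PATH` value, and the timeout are assumptions made for illustration. The exit-code mapping follows the usual plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import subprocess

# Conventional Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv, timeout=10):
    """Run a plugin with a sparse environment and return (state, first output line).

    Hypothetical helper: the environment below is an assumption meant to
    approximate what a daemon-launched check sees, not Nagios's actual one.
    """
    minimal_env = {"PATH": "/bin:/usr/bin"}
    try:
        proc = subprocess.run(
            argv,
            env=minimal_env,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold stderr in so error text is visible
            timeout=timeout,
            text=True,
        )
    except subprocess.TimeoutExpired:
        return "TIMEOUT", "plugin ran longer than %d seconds" % timeout
    except OSError as exc:
        # Covers a missing file or missing execute permission.
        return "UNKNOWN", "could not execute plugin: %s" % exc

    # Any exit code outside 0-3 is treated as UNKNOWN.
    state = STATES.get(proc.returncode, "UNKNOWN")
    lines = proc.stdout.splitlines()
    first_line = lines[0] if lines else "(no output)"
    return state, first_line


# Example call (the path and arguments are placeholders):
#   print(run_plugin(["/path/to/plugin", "arg1"]))
```

Running this as the same user the Nagios daemon runs as, rather than from an interactive login shell, should give a closer comparison. If the result differs from the hand test, the plugin likely depends on something in the login environment.
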
Thank you for the suggested places to look, regardless...
Steven Schwartz