"Return code of 141 is out of bounds" Error in Nagios 3.2.3
Allan Clark
allanc at chickenandporn.com
Mon Jun 20 04:59:58 CEST 2011
>> On Mon, Jun 20, 2011 at 9:24 AM, Rai Ricafrente <maillist at ricafrente.com>
>> wrote:
>> > Hi everyone,
>> >
>> > I just installed a fresh Nagios v3.2.3 with about 150 hosts and 600
>> > services. I noticed that, from time to time, hosts throw a "Return
>> > code of 141 is out of bounds" status, which eventually goes away. I
>> > don't know if this has anything to do with the plugin, since the
>> > status returns to the OK state without intervention, which proves
>> > that the check_icmp plugin works just fine.
>> >
>> > I'm confused by this error, which did not manifest itself when we
>> > were using Nagios v2. Has anyone had the same issue?
>> >
>> > Big thanks,
>> >
>> > Rai
> On Mon, Jun 20, 2011 at 10:16 AM, Yueh-Hung Liu <yuehung.liu at gmail.com>
> wrote:
>>
>> Nagios only accepts the integers 0-3 as plugin return codes.
>> Try manually executing the command of the service in question (as the
>> user Nagios runs as) and check the output.
On Sun, Jun 19, 2011 at 19:24, Rai Ricafrente <maillist at ricafrente.com> wrote:
> The output returns OK status when run manually. The error seems to
> occur at random times but, as mentioned, eventually goes away. If the
> plugin were the issue, the error would be persistent. In my case, it
> happens from time to time. I have only experienced this with Nagios
> 3.2.3; it never happened in Nagios v2.6.
(Quick reminder: this is a mailing list, so please don't top-post.)
Rai, the logic of "it never happened on 2.6, therefore 3.2.3 is in
error" is like "we've never had an oil rig explode in the Gulf of
Mexico before" :)
Really, the way to find out what is to blame is similar to Yueh-Hung
Liu's suggestion, but make a wrapper for the script instead. The
wrapper should record the environment and the parameters offered to
the script, check the return code, and store the results under a
filename based on that return code, for example by renaming the
temporary file used for collection. In /bin/bash, you could store
all content in a file /tmp/nagios-tmp.$$ and then, based on the $?
of the script execution, "mv /tmp/nagios-tmp.$$ /tmp/ret.$?" or some
such.
To explain what this offers, consider that you may have the return
codes 0,1,2,3, and 141, and you're using "/tmp/ret" as a base
filename.
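A minimal sketch of such a wrapper, assuming bash and a plugin whose output can safely have stderr merged into stdout. The names wrap_plugin and RET_BASE are made up for illustration and are not part of Nagios; the temporary file and /tmp/ret base follow the naming above.

```shell
#!/bin/bash
# Hypothetical wrapper sketch; wrap_plugin and RET_BASE are illustrative names.
RET_BASE="${RET_BASE:-/tmp/ret}"

wrap_plugin() {
    local tmp="/tmp/nagios-tmp.$$"
    local rc=0 output

    # Record what the plugin is being given: parameters and environment.
    {
        echo "== date: $(date)"
        echo "== command: $*"
        echo "== environment:"
        env | sort
    } > "$tmp"

    # Run the real plugin, capturing its output and its return code.
    output="$("$@" 2>&1)" || rc=$?

    {
        echo "== return code: $rc"
        echo "== output:"
        printf '%s\n' "$output"
    } >> "$tmp"

    # Keep only the latest record per return code, e.g. /tmp/ret.0 or /tmp/ret.141.
    mv -f "$tmp" "$RET_BASE.$rc"

    # Pass the plugin's output and return code through unchanged.
    printf '%s\n' "$output"
    return "$rc"
}

# When saved as a script and used in place of the real plugin in the
# command definition, finish with something like:
#   wrap_plugin /path/to/real/plugin "$@"; exit $?
```

Because each record overwrites the previous one for the same return code, the disk usage stays at one small file per distinct code.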
Once the wrapper has been running for a while, and you have a few
successful results plus a "141" return code, compare any of
/tmp/ret.0, /tmp/ret.1, /tmp/ret.2, /tmp/ret.3 with the contents of
/tmp/ret.141. You only need to keep the last occurrence of each
(since they should be similar), which keeps you from running out of
disk. You can run this overnight without filling your monitoring
system's disk; the only real difference is the file I/O you've added.
Then, when you compare the wrapper output in the 141 case to the 0-3
case (i.e. "diff /tmp/ret.0 /tmp/ret.141"), you'll see whether the
input environment or parameters are different.
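As a rough sketch of that comparison, assuming the per-code files already exist under a base name such as /tmp/ret (compare_ret is a made-up helper name, not a Nagios tool):

```shell
#!/bin/bash
# Hypothetical helper: diff each normal result file (codes 0-3) against
# the file recorded for the unexpected return code.
compare_ret() {
    local base="${1:-/tmp/ret}" bad="${2:-141}" code
    for code in 0 1 2 3; do
        # Skip codes that have not been seen yet.
        [ -f "$base.$code" ] && [ -f "$base.$bad" ] || continue
        echo "=== $base.$code vs $base.$bad ==="
        # diff exits non-zero when the files differ; that is expected here.
        diff -u "$base.$code" "$base.$bad" || true
    done
}

# Example: compare_ret /tmp/ret 141
```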
If the input is essentially the same either way, then have the wrapper
turn on some debugging or tracing when it executes the wrapped script;
the output will still be collected, but you'll have some verbose debug
information to dig through to see why. Alternatively, if the input
does change, you'll be able to see what Nagios is doing differently
between executions.
Allan
--
allanc at chickenandporn.com "金鱼" http://linkedin.com/in/goldfish