Nagios 2.3 internal server error.
Ethan Galstad
nagios at nagios.org
Tue May 16 21:55:37 CEST 2006
Try running the status CGI from the command line with a script. This
will ensure that Apache isn't causing any problems. Ideally, it would
be most useful to run the CGIs under GDB so you can do a trace when it
coredumps.
To run the CGIs from a command line, do the following:
[nagios@lanman ~]# setenv REQUEST_METHOD "GET"
[nagios@lanman ~]# setenv REMOTE_USER "username"
[nagios@lanman ~]# /usr/local/nagios/sbin/status.cgi
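Note that setenv is csh/tcsh syntax. For a Bourne-style shell (sh, bash), a minimal sketch of the same steps might look like the following. The run_cgi helper name and its optional query-string argument are assumptions for illustration, not part of Nagios, and "username" is the same placeholder as in the commands above.

```shell
#!/bin/sh
# Run a CGI binary outside Apache with a minimal CGI environment.
# Usage: run_cgi /path/to/cgi [remote_user] [query_string]
# (run_cgi is a hypothetical helper, not something shipped with Nagios.)
run_cgi() {
    cgi_path=$1
    remote_user=${2:-username}   # placeholder user, as in the setenv example
    query_string=${3:-}          # assumption: empty query string by default

    if [ ! -x "$cgi_path" ]; then
        echo "run_cgi: $cgi_path is not executable" >&2
        return 127
    fi

    # The variables are set for this one invocation only.
    REQUEST_METHOD=GET REMOTE_USER=$remote_user QUERY_STRING=$query_string \
        "$cgi_path"
    status=$?

    # By shell convention, a status above 128 means death by signal
    # (for example 139 = 128 + 11, SIGSEGV on Linux).
    if [ "$status" -gt 128 ]; then
        echo "run_cgi: $cgi_path died from signal $((status - 128))" >&2
    fi
    return "$status"
}
```

For example, `run_cgi /usr/local/nagios/sbin/status.cgi username` repeats the steps above, and a non-zero return value shows whether the CGI itself fails without Apache in the picture.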
To run the CGIs under GDB, do this:
[nagios@lanman ~]# setenv REQUEST_METHOD "GET"
[nagios@lanman ~]# setenv REMOTE_USER "username"
[nagios@lanman ~]# gdb /usr/local/nagios/sbin/status.cgi
Ideally, you'd want to use the unstripped CGI binaries (located in the
cgi/ subdirectory of the Nagios source) for maximum debugging efficiency.
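Once inside GDB, typing "run" starts the CGI and "bt" prints a backtrace after a crash. A non-interactive sketch of the same thing, in Bourne shell, assuming a GDB that supports the -batch and -x options. The function names write_gdb_commands and trace_cgi are hypothetical, and "username" is again a placeholder.

```shell
#!/bin/sh
# Write a GDB command file that runs the program and then prints a backtrace.
write_gdb_commands() {
    cat > "$1" <<'EOF'
run
bt
EOF
}

# Run a CGI under GDB in batch mode with the CGI environment set.
# Usage: trace_cgi /path/to/unstripped/cgi [remote_user]
# (trace_cgi is a hypothetical helper, not something shipped with Nagios.)
trace_cgi() {
    cgi_path=$1
    cmd_file=${TMPDIR:-/tmp}/cgi-trace.$$.gdb
    write_gdb_commands "$cmd_file"

    # The program started by "run" inherits GDB's environment.
    REQUEST_METHOD=GET REMOTE_USER=${2:-username} \
        gdb -batch -x "$cmd_file" "$cgi_path"
    status=$?

    rm -f "$cmd_file"
    return "$status"
}
```

If the CGI exits normally, the "bt" step has no stack to print; the backtrace is only interesting on the runs where it actually crashes, so this may need to be repeated many times given how intermittent the failures are.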
Eli Stair wrote:
>
> Yeah, I think it was that (or b2, have to check, just keep the dir
> symlinked for use) that was the last “stable” version for me, without
> the CGI’s crapping out with alarming regularity. I’ve actually seen the
> daemon die twice today (the typical ‘caught sigsegv, shutting down...’)
> since running 2.3 (not present in 2.2 for me).
>
> Any input from anyone anywhere on the cause? I still haven’t heard a
> peep in response other than that I’m not the only one this is happening
> to.... Offering to try and be of use doesn’t seem to be well regarded.
> I’m trying not to sound whiny, but this has been a fairly unresponsive
> project WRT acknowledging and fixing problems. I don’t know what more
> I can do, I can’t even find anyone who seems potentially interested in
> helping to throw money at :)
>
> I’m leaving 2.3 running overnight, doing a ps -jHF on it every second, maybe
> I’ll catch it in some bad act of wedging during some child process
> spawn, or amidst a mem leak phase right before it dies... It won’t have
> anything to do (I’d imagine) with the CGI’s dying, but maybe the cause
> of the segfaults... Not going to recompile and run the daemon in heavy
> debug until I get a hit that it’s wanted or useful to my cause.
>
> Thanks for the input Alessandro,
>
> /eli
>
>
> On 5/9/06 7:30 PM, "Alessandro Ren" <alessandro.ren at opservices.com.br>
> wrote:
>
>
> Eli,
>
> try to use nagios-2.0b4, it doesn't give me any errors so far.
> Nagios 2.3 seems to be generating more errors in the CGIs than
> the previous 2.2.
> I will try to find a pattern in this and look in the code for
> memory leaks and the like.
>
> []s.
>
> Eli Stair wrote:
>
> Re: [Nagios-devel] Re: Nagios 2.3 internal server error.
> Sorry for the clutter, my earlier post was too optimistic...
> Total of 5000 requests via elinks for a host detail shows only 3
> 500 (internal server error) issues on the client side. At the
> same time, this generated 63 “Premature end of script headers:
> status.cgi” errors in the apache logs. Only one revealed the
> referrer URL:
>
> [Tue May 09 16:23:57 2006] [error] [client 10.73.16.108]
> Premature end of script headers: status.cgi, referer:
> https://monitor02/nagios/cgi-bin/status.cgi?hostgroup=deathstar-opteron-850-32G&style=overview
>
> During this period of testing (several hours, 5000+ checks with
> links, and 10 windows open with various UI views refreshing
> every 5 minutes), only this one message was verbose; the rest
> were to the effect of:
>
> [Tue May 09 16:23:52 2006] [error] [client 10.73.16.108]
> Premature end of script headers: status.cgi
>
> And through the entirety firefox received four 500 pages, links
> three, and yet 63 “premature” errors were generated by the CGI’s.
>
> Still looks broken, my bad for being excited. Will compiling
> with any of the debug flags set cause the CGI’s to output more
> useful info, or are they only for the nagios daemon as it seems
> to be?
>
> /eli
>
>
> On 5/9/06 2:31 PM, "Eli Stair" <estair at ilm.com> wrote:
>
>
>
> Hmm, missed this.
>
> Gave this code a shot and am still seeing the problem,
> though it seems at a _MUCH_ lower rate of frequency. Of
> 1200 hits I got only 3 500’s returned by the client
> (previous failure rate was around 1:40). The other
> significant change is I’m not seeing segfaults reported by
> the CGI’s, nor the “premature end of script headers” message
> in the apache logs that used to correspond with these 500’s.
> I’m guessing this fixed the major problem and the symptoms
> of it (segv’s, script header issue)...
>
> Unless something else was changed, I’d say it’s “mostly
> fixed”, or at least better, likely due to the content_length
> issue?
>
> Thanks devs, I had actually given up hope that this would be
> tracked down and addressed. This is great news
> (pre-emptively). No idea if this was randomly spotted or
> someone went looking for it due to my (and others’) reports,
> but either way I appreciate it.
>
> Cheers,
>
> /eli
>
> 2.3 - 05/03/2006
> * Bug fix for negative HTTP content_length header in CGIs
>
>
> On 5/9/06 5:22 AM, "Alessandro Ren"
> <alessandro.ren at opservices.com.br> wrote:
>
>
>
>
> I've updated to nagios 2.3 and I am still getting
> the internal server error from time to time when the CGIs
> refresh.
> Eli, have you tried the 2.3 already?
> Just to let the list know.
>
Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org