Bad File Descriptors
Mohr James
james.mohr at elaxy.com
Wed Jul 23 15:09:21 CEST 2008
> -----Original Message-----
> From: nagios-users-bounces at lists.sourceforge.net
> [mailto:nagios-users-bounces at lists.sourceforge.net] On
> Behalf Of Ryan Steele
> Sent: Wednesday, 23 July 2008 15:04
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Bad File Descriptors
>
> Hey folks.
>
> I recently had a co-worker present me with a problem
> regarding the NSCA plugin. It seems that under certain
> circumstances (unfortunately, those circumstances are unknown
> to him and thus me as well), NSCA just kind of hangs (an
> strace shows basically an idle screen) and these sorts of
> errors start flooding the daemon log:
>
>
> nsca[28640]: Network server accept failure (9: Bad file descriptor)
>
>
> The quick fix is to restart NSCA, and then everything hums
> along until the next incident.
>
>
> It's possible there's a bad block on the disk or something, and an
> fsck might yield some clues, but I haven't had the chance to schedule
> downtime to do that yet. It's also possible it's hitting the fd limit,
> but in the time I've been monitoring it, I don't see any leaking of
> fd's that would point to that as a suspect (the limit is the default
> of 1024). Additionally, according to ulimit, the pipe size is 4k,
> which could be an issue as the nsca clients write to a pipe on the
> server (nagios.cmd), but that's only an option configurable at kernel
> compile-time and I expect I'd see more widespread reports of problems
> from other folks in the community if overflowing the default pipe
> buffer was really the issue.
>
> I've seen some sparse reports on Google of a similar problem, but
> they're just that - sparse. Which kind of makes me think it's not
> Nagios or NSCA, but a bad block on the hard drive. Anybody have a
> similar experience or opinion?
We had similar problems when there were a lot of passive services, and it seemed that NSCA was simply getting overloaded. To be honest, I am not sure that we were getting "Bad file descriptor", but NSCA would not accept any new connections, and the solution was to restart it.
Regards,
Jim Mohr
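
The fd-limit theory raised in the quoted message could be checked on the running daemon rather than inferred. What follows is only a rough sketch, assuming a Linux host where /proc/<pid>/fd and /proc/<pid>/limits are available and readable (usually as root or as the user running nsca). The function names and the proc_root parameter are made up for illustration and are not part of Nagios or NSCA.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count the entries in <proc_root>/<pid>/fd (one per open descriptor)."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def soft_nofile_limit(pid, proc_root="/proc"):
    """Return the soft 'Max open files' limit from <proc_root>/<pid>/limits.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    with open(os.path.join(proc_root, str(pid), "limits")) as fh:
        for line in fh:
            if line.startswith("Max open files"):
                # Fields after the three-word name: soft limit, hard limit, units
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid, proc_root="/proc"):
    """Return (open descriptor count, soft limit) for the given pid."""
    return count_open_fds(pid, proc_root), soft_nofile_limit(pid, proc_root)
```

Calling fd_usage(28640) with the pid from the log line above, every minute or so from cron, and logging the result would show whether the count creeps toward 1024 before the accept failures start. If it stays flat, the fd limit is probably not the cause.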