Growing number of orphaned service checks...

Charles Dee Rice cdrice at pobox.com
Wed Mar 2 23:34:33 CET 2005


Hello!  I've been lurking on the lists for a while now.  I have a
smallish environment of 41 hosts, 397 active checks and 5 passive checks.
Most of my active checks are using nrpe-2.0, and my passive checks are
submitted via nsca-2.4.  I noticed recently that some active checks did not
appear to be completing in a timely fashion (most are set to an interval of 7
minutes, but seemed to be taking an hour or more to complete), so I turned on
check_for_orphaned_services.

Then I saw a rather alarming number of services which nagios was detecting as
orphaned, and rescheduled for immediate checks.  The longer nagios is left
running, the longer and longer this list becomes (although it does not
contain a predictable list of services; in other words, it's not the "same"
services being orphaned all the time), and the more and more nagios processes
are left running ("Process Info" reports upwards of 600 nagios processes
running).
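
To put numbers on that growth, something like the following Python sketch
could tally orphan warnings per hour from the nagios log.  It assumes each
log line starts with a bracketed epoch timestamp and that the orphan warning
contains the word "orphaned"; both are assumptions to check against the
actual log, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Assumed log line shape: "[<epoch seconds>] <message>"; adjust if yours differs.
LINE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def orphans_per_hour(lines):
    """Count log lines mentioning 'orphaned', bucketed by hour (epoch seconds)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        # Skip lines that do not fit the assumed shape or are not orphan warnings.
        if match and "orphaned" in match.group(2):
            hour_start = int(match.group(1)) // 3600 * 3600
            counts[hour_start] += 1
    # Return buckets in chronological order so a rising trend is easy to see.
    return dict(sorted(counts.items()))
```

Feeding it the lines of the log file (for example via open(path), where path
is wherever the nagios log lives on the server) gives one count per hour, so
a steadily rising count would confirm that the backlog grows with uptime.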

I can restart nagios to "catch up" for some time, but left running, the
orphaned list begins to grow again.  The monitored nodes are not
heavily-taxed either.

My management server is also used for other web services (other in-house
business web pages, user interfaces, etc.), but it is not, in my opinion,
unusually busy or overtaxed.

I've experimented changing my max_concurrent_checks value from the default of
0 to values both above and below what is recommended by running "nagios -s",
with no noticeable improvement.  I've tried extending my
normal_check_interval, and that seemed to delay the initial onset of the
problem (it took longer to start seeing orphaned checks, but they continued
to grow just the same).
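
As a rough sanity check on those numbers, the sketch below estimates how many
checks need to be in flight on average just to keep up with the schedule.
This is a generic queueing estimate (Little's law), not the calculation that
"nagios -s" performs, and the 10-second average check duration in the usage
note is a made-up figure for illustration.

```python
def required_concurrency(num_checks, interval_minutes, avg_check_seconds):
    """Average number of checks in flight needed to keep up with the schedule.

    Little's law: items in flight = arrival rate * time each item takes.
    """
    checks_per_second = num_checks / (interval_minutes * 60)
    return checks_per_second * avg_check_seconds
```

With 397 active checks on a 7-minute interval, the scheduler has to start
roughly one check per second; calling required_concurrency(397, 7, 10) would
then give the average concurrency needed if each check took a hypothetical
10 seconds.  If checks regularly take much longer than expected, the
concurrency needed grows in proportion.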

I'll be happy to post any specific configuration or log file entries as
anyone sees appropriate, but didn't want to clutter the list with more info
than needed.

I'm using nagios-1.2 on a Linux box running Red Hat Enterprise Linux ES
release 2.1 (Panama), linux kernel 2.4.9-e.27.  Updating the Linux release is
not an option (corporate standard and configuration freeze), and running
nagios-2-beta is highly undesirable due to its "beta" state and the usual
corporate fear of beta-release software.  However, if this issue is addressed
in nagios-2, I might be able to make a business case to upgrade.

If there are resources available online which I could use to help
troubleshoot this issue, please point me to them.  I've quite thoroughly
reviewed the 1.2 and 2-beta docs, FAQs and mail archives, and haven't found a
solution.  If there is more information I can post regarding my
configuration, please ask away...

Thanks - Chuck


