FW: Problems with distributed setup, master overload?

Tyler Lund tlund at hostedsolutions.com
Wed Jun 20 18:05:56 CEST 2007


Hi Jeffrey,

I encountered the exact same issue. I've managed to bring it under control with two workarounds, but have not yet found a real solution. It seems Nagios simply stops reading the command pipe every once in a while and resumes after a few minutes.

I implemented batching of service check results with OCPD:

http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon

This has cut down on the number of NSCA children that spawn, and it avoids the fork bomb that used to build up before Nagios started reading the command pipe again. Usually Nagios now recovers on its own before it becomes a problem.
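To illustrate the batching idea, here is a minimal Python sketch (not OCP_Daemon's actual code) that collects several check results and hands them to a single send_nsca invocation, so the master sees one NSCA connection per batch instead of one per result. It assumes send_nsca's default tab-delimited input (host, service, return code, output, one result per line); the CheckResult type, the function names and the /etc/send_nsca.cfg path are placeholders I made up for the example.

```python
import subprocess
from typing import Iterable, NamedTuple


class CheckResult(NamedTuple):
    """One passive service check result (hypothetical helper type)."""
    host: str
    service: str
    return_code: int  # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
    output: str


def format_batch(results: Iterable[CheckResult], delim: str = "\t") -> str:
    """Build one send_nsca stdin payload: one delimited line per result."""
    lines = []
    for r in results:
        # Keep each result on a single line and free of stray delimiters.
        text = r.output.replace("\n", " ").replace(delim, " ")
        lines.append(delim.join([r.host, r.service, str(r.return_code), text]))
    return "".join(line + "\n" for line in lines)


def send_batch(results: Iterable[CheckResult], nsca_host: str,
               send_nsca: str = "send_nsca",
               config: str = "/etc/send_nsca.cfg") -> None:
    """Send the whole batch with a single send_nsca process.

    The binary name and config path are assumptions; adjust for your install.
    """
    payload = format_batch(results)
    if not payload:
        return  # nothing queued, so do not spawn a process at all
    subprocess.run([send_nsca, "-H", nsca_host, "-c", config],
                   input=payload.encode(), check=True)
```

A collector would queue results for a second or so and then call send_batch once per interval, rather than forking send_nsca for every individual check.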

In the event I need to recover it manually, simply emptying the command pipe does the trick (I just cat nagios.cmd). All the stalled NSCA processes dump their data and exit. This is not ideal, as all the pending check results are lost in the process.
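For what it's worth, the manual drain could be scripted roughly like this. It is only a sketch of the same workaround, not a fix: it opens the FIFO non-blocking, reads until nothing more arrives for a short idle period, and can optionally append what it read to a file instead of throwing it away. The drain_pipe name and its parameters are my own invention, and whether the saved lines can usefully be fed back to Nagios afterwards is an assumption I have not tested.

```python
import os
import select


def drain_pipe(path, save_to=None, idle_timeout=1.0, chunk_size=65536):
    """Empty a FIFO such as nagios.cmd and return the number of bytes drained.

    If save_to is given, the drained data is appended to that file rather
    than being discarded. Stops once the pipe stays quiet for idle_timeout
    seconds or all writers have closed.
    """
    # O_NONBLOCK lets the open succeed even when no writer is attached.
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    out = open(save_to, "ab") if save_to else None
    total = 0
    try:
        while True:
            ready, _, _ = select.select([fd], [], [], idle_timeout)
            if not ready:
                break  # nothing arrived within the idle window
            try:
                chunk = os.read(fd, chunk_size)
            except BlockingIOError:
                continue  # another reader got the data first; wait again
            if not chunk:
                break  # EOF: every writer has closed its end
            total += len(chunk)
            if out:
                out.write(chunk)
    finally:
        os.close(fd)
        if out:
            out.close()
    return total
```

Something like drain_pipe("nagios.cmd", save_to="/tmp/nagios.cmd.drained") would unblock the stalled writers the same way cat does, while keeping a copy of what was in the pipe.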

Has anyone arrived at a permanent solution? 

Thanks!

-T


					





