<span class="Apple-style-span" style="border-collapse: collapse; "><div>Hi, we just moved a Nagios 1.2 installation to Nagios 3 with 8500 service checks and 1200 host checks.</div><div><br></div><div>I experimented with PNP and, as others have mentioned, had to offload the processing work to another server.</div>
<div><br></div><div>What I did was create a ramdisk on the Nagios server to hold the performance data, then batch it off every couple of minutes to a separate server that only does the graphing and serves the graph pages. If the Nagios server crashed, the most performance data that could be lost was a couple of minutes' worth.</div>
<div><br></div><div>So the service and host performance data commands are just a script which:</div><div>* renames the data file with a date and time stamp</div><div>* checks whether the graphing server is up</div><div>* if it is, copies any data files across with scp and cleans up</div>
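<div><br></div><div>The script itself was not posted, so the following is only a minimal sketch of those three steps in Python. The function names, the ssh-based reachability check, and any paths or host names passed in are assumptions for illustration, not the actual script.</div>

```python
import subprocess
import time
from pathlib import Path
from typing import Optional


def rotate(data_file: Path, now: Optional[float] = None) -> Optional[Path]:
    """Step 1: rename the live perfdata file with a date and time stamp."""
    if not data_file.exists():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    target = data_file.with_name(f"{data_file.name}.{stamp}")
    data_file.rename(target)
    return target


def server_is_up(host: str, run=subprocess.run) -> bool:
    """Step 2: assumed check, a short non-interactive ssh call to the host."""
    cmd = ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]
    return run(cmd, capture_output=True).returncode == 0


def ship(spool_dir: Path, data_name: str, host: str, remote_dir: str,
         run=subprocess.run) -> int:
    """Step 3: scp each stamped file across and delete it once it was sent."""
    sent = 0
    for path in sorted(spool_dir.glob(f"{data_name}.*")):
        result = run(["scp", "-q", "-B", str(path), f"{host}:{remote_dir}"])
        if result.returncode != 0:
            break  # keep this file and the rest for the next run
        path.unlink()
        sent += 1
    return sent


def process(data_file: Path, host: str, remote_dir: str,
            run=subprocess.run, now: Optional[float] = None) -> int:
    """Run all three steps; return the number of files shipped."""
    rotate(data_file, now)
    if not server_is_up(host, run):
        return 0  # stamped files stay on the ramdisk until the server is back
    return ship(data_file.parent, data_file.name, host, remote_dir, run)
```

<div>Nagios would call something like process(Path("/mnt/ramdisk/service-perfdata"), "graphs.example.com", "/var/spool/perfdata/") every couple of minutes, where all three values are hypothetical. Files that could not be sent simply stay on the ramdisk and are picked up by the next run.</div>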
<div><br></div><div>No complex code, just a reverse proxy on the Nagios server so that when you request a graph page it looks like it is being served from the Nagios server itself.</div><div><br></div><div>Because the graphing server is addressed through a CNAME, I can quickly move it to another server running npcd.</div>
<div><br></div><div><br></div>Date: Fri, 6 Feb 2009 12:24:54 -0500<br>From: "Michael W. Lucas" &lt;<a href="mailto:mwlucas@blackhelicopters.org" style="color: rgb(42, 93, 176); ">mwlucas@blackhelicopters.org</a>&gt;<br>
Subject: [Nagios-users] PNP performance<br>To: <a href="mailto:nagios-users@lists.sourceforge.net" style="color: rgb(42, 93, 176); ">nagios-users@lists.sourceforge.net</a><br>Message-ID: &lt;20090206172454.GA16459@bewilderbeast.blackhelicopters.org&gt;<br>
Content-Type: text/plain; charset=us-ascii<br><br>Hi,<br><br>I have Nagios 3.0.6 on FreeBSD 7.1/amd64, with ~1500 services on ~250<br>hosts. I've been investigating replacing our MRTG setup with PNP.<br>PNP runs fine, and we are keeping about 1450 MRTG graphs. Our plans<br>
call for at least doubling the number of services and graphs.<br><br>Adding process_perfdata.pl has increased the system load from<br>"minimal" to 10-20. I see that process_perfdata.pl will not run under<br>ePN.<br>
<br>Anyone out there have ways to reduce process_perfdata.pl's system<br>load, short of rewriting it in C or making it ePN-friendly?</span><br>