RETRY: CPU Question
Fred Albrecht
Fred.Albrecht at za.tiscali.com
Mon Apr 7 15:43:59 CEST 2003
The system idles at 80% because I run other apps on the machine as well, such as RTG to collect my router interface stats, scripts to check passives, etc., and these run continuously. I have used top, and the only processes that really load the system are the cgis. I've also added timing checks to status.c to measure how long its different parts take to run. What I found was:
TIME TO read_all_object_configuration_data=7.000000
TIME TO read_all_status_data=3.000000
TIME TO finish all=6.000000
TIME TO run=16.000000
These are in seconds. So reading the object configuration data takes roughly 44% of the time (7 of 16 seconds), reading the status data roughly 19% (3 of 16) and generating the web interface roughly 38% (6 of 16).
Lane, what type of system do you run? Could you send me the specs so that I can compare them with what I have, please?
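For anyone who wants to repeat this kind of measurement, here is a minimal sketch of how a phase could be timed and turned into a percentage. It is not the code that was added to status.c: the helper names (elapsed_seconds, percent_of, time_phase) are hypothetical, and it assumes a POSIX system with clock_gettime, which gives sub-second resolution rather than whole seconds.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime under strict C modes */
#endif
#include <stdio.h>
#include <time.h>

/* Elapsed seconds between two clock samples (hypothetical helper). */
double elapsed_seconds(const struct timespec *start, const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Share of the total run time as a percentage; 0 if total is not positive. */
double percent_of(double part, double total)
{
    return total > 0.0 ? 100.0 * part / total : 0.0;
}

/* Run one phase, log its duration in the "TIME TO label=secs" style,
 * and return the duration so the caller can add up a total.
 * In a CGI, stderr normally ends up in the web server's error log. */
double time_phase(const char *label, void (*phase)(void))
{
    struct timespec start, end;
    double secs;

    clock_gettime(CLOCK_MONOTONIC, &start);
    phase();
    clock_gettime(CLOCK_MONOTONIC, &end);

    secs = elapsed_seconds(&start, &end);
    fprintf(stderr, "TIME TO %s=%f\n", label, secs);
    return secs;
}
```

Each phase would be wrapped in a small void function and passed to time_phase; dividing each returned duration by their sum with percent_of gives the per-phase split.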
Thanx
:)
fred
-----Original Message-----
From: Williams, P. Lane [mailto:Lane.Williams at jhuapl.edu]
Sent: 07 April 2003 01:52 PM
To: Fred Albrecht
Subject: RE: [Nagios-users] RETRY: CPU Question
The fact that your system normally idles at 80% may have something to do with it. All Linux distributions I've used have typically idled at 98.xxx% or better when not under load. I've also done what you've done, cycling through the cgi's to test performance. I would see a momentary spike in CPU use, but only around 30%-40%. At the moment I do not have as many checks as you. If you haven't already, you may want to use 'top' and see if you have any runaway processes or possible memory leaks.
Lane
-----Original Message-----
From: Fred Albrecht [mailto:Fred.Albrecht at za.tiscali.com]
Sent: Monday, April 07, 2003 7:33 AM
To: Williams, P. Lane
Subject: RE: [Nagios-users] RETRY: CPU Question
No, I am saying that no swap is being used; there's no need. The system is configured with a Gig's worth of swap, but everything manages to run in memory without swapping to disk. Looking at the system now, there is 3 MB of swap used and 980 MB free, with 43 MB of normal memory free. Thanx for your reply.
-----Original Message-----
From: Williams, P. Lane [mailto:Lane.Williams at jhuapl.edu]
Sent: 07 April 2003 01:15 PM
To: Fred Albrecht
Subject: RE: [Nagios-users] RETRY: CPU Question
Are you saying you have no "swap" file?
Lane
-----Original Message-----
From: Fred Albrecht [mailto:Fred.Albrecht at za.tiscali.com]
Sent: Monday, April 07, 2003 4:04 AM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] RETRY: CPU Question
Hi
Not having received a reply on my previous question, I'll try again. :) (Please tell me where I can ask this question, if this is the wrong place to ask.)
My cgi's take about 30 seconds from clicking on their links to displaying something on my screen. I'm running a P4 with 512M of RAM on Red Hat 7.2 (uname shows Linux 2.4.20). The system idles at 80% CPU free most of the time, until I hit a cgi, which drops the idle down to 0% until the cgi finishes (as mentioned earlier, 25-30 seconds later) and the system goes back to 80% idle. No swap is being used.
I've done the following optimizations:
Placed my critical files on ramdisk. They are:
-rwxr-xr-x 1 nagios nagios 755 Apr 4 15:43 contactgroups.cfg
-rwxr-xr-x 1 nagios nagios 2822 Apr 4 15:43 contacts.cfg
-rwxr-xr-x 1 nagios nagios 14999 Apr 7 09:43 hostextinfo.cfg
-rwxr-xr-x 1 nagios nagios 1565 Apr 4 15:43 hostgroups.cfg
-rwxr-xr-x 1 nagios nagios 26585 Apr 4 15:43 hosts.cfg
-rwxr-xr-x 1 nagios nagios 536 Apr 4 15:43 hosts-uses.cfg
drwxr-xr-x 2 nagios nagios 12288 Apr 3 16:23 lost+found
-rwxr-xr-x 1 nagios nagios 3092 Apr 4 15:43 misccommands.cfg
-rwxr-xr-x 1 nagios nagios 1987817 Apr 4 15:43 serviceextinfo.cfg
-rwxr-xr-x 1 nagios nagios 1696675 Apr 4 15:43 services.cfg
-rwxr-xr-x 1 nagios nagios 3941 Apr 4 15:43 services-uses.cfg
-rw-r--r-- 1 nagios nagiocmd 759981 Apr 7 09:50 status.log
-rw-rw-r-- 1 nagios nagios 209360 Apr 7 09:43 status.sav
-rwxr-xr-x 1 nagios nagios 1112 Apr 4 15:43 timeperiods.cfg
retention_update_interval=15
aggregate_status_updates=15
My nagios stats are as follows:
Check Execution Time: 0 / 7 / 0.052 sec
Check Latency: 0 / 14 / 0.605 sec
# Active Checks: 3404
# Passive Checks: 334
I've done everything that I could implement in the "Tuning Nagios For Maximum Performance" section.
At one stage I even nfs mounted the nagios directory on another machine, from which I let my clients access the cgi's. Sharing CPU this way worked fine, meaning that whenever the web interface becomes too slow, I can just add another server to my nagios farm. The only drawback is that the clients can't write to the nagios.cmd file across the nfs mount. It would have been a nice feature if it did work. Which raises the next question: Nagios is a distributed NMS, so how about making it a distributed client interface system, if you follow what I mean? How can I get this done?
Is there anything else I can do to improve the response time of the cgi's? Is this a hardware or software issue?
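Some background on the nagios.cmd problem: the external command file is a named pipe (FIFO), and data written to a FIFO passes through the kernel of the machine doing the write, so a writer on an NFS client never reaches the Nagios process reading on the server. Commands have to be written on the Nagios host itself. The sketch below shows what such a local write could look like, using the external command line format "[timestamp] COMMAND;args". The function names and any path passed in are hypothetical, and this is only an illustration, not code from Nagios.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Format a passive service check result as an external command line.
 * Returns the line length, or -1 if it does not fit in the buffer. */
int format_passive_result(char *buf, size_t buflen, time_t when,
                          const char *host, const char *service,
                          int return_code, const char *output)
{
    int n = snprintf(buf, buflen,
                     "[%lu] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n",
                     (unsigned long)when, host, service, return_code, output);
    return (n < 0 || (size_t)n >= buflen) ? -1 : n;
}

/* Write one command line to the command FIFO on the local machine.
 * O_NONBLOCK makes open() fail instead of hanging when nothing is
 * reading the pipe (for example, when Nagios is not running).
 * Returns 0 on success, -1 on any failure. */
int submit_command(const char *fifo_path, const char *line)
{
    size_t len = strlen(line);
    ssize_t written;
    int fd = open(fifo_path, O_WRONLY | O_NONBLOCK);

    if (fd < 0)
        return -1;
    written = write(fd, line, len);
    close(fd);
    return (written == (ssize_t)len) ? 0 : -1;
}
```

A remote web front end would need some relay on the Nagios host (for instance a small network service or a remote shell call, both assumptions here) that receives the command and calls something like submit_command locally.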
Any suggestions will be highly appreciated.
Thanx
fred