<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML xmlns="http://www.w3.org/TR/REC-html40" xmlns:v =
"urn:schemas-microsoft-com:vml" xmlns:o =
"urn:schemas-microsoft-com:office:office" xmlns:w =
"urn:schemas-microsoft-com:office:word" xmlns:m =
"http://schemas.microsoft.com/office/2004/12/omml"><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2900.6036" name=GENERATOR><!--[if !mso]>
<STYLE>v\:* {
BEHAVIOR: url(#default#VML)
}
o\:* {
BEHAVIOR: url(#default#VML)
}
w\:* {
BEHAVIOR: url(#default#VML)
}
.shape {
BEHAVIOR: url(#default#VML)
}
</STYLE>
<![endif]-->
<STYLE>@font-face {
font-family: SimSun;
}
@font-face {
font-family: Mangal;
}
@font-face {
font-family: Cambria Math;
}
@font-face {
font-family: Calibri;
}
@font-face {
font-family: Tahoma;
}
@font-face {
font-family: @SimSun;
}
@page WordSection1 {size: 8.5in 11.0in; margin: 1.0in 1.0in 1.0in 1.0in; }
P.MsoNormal {
FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; FONT-FAMILY: "Times New Roman","serif"
}
LI.MsoNormal {
FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; FONT-FAMILY: "Times New Roman","serif"
}
DIV.MsoNormal {
FONT-SIZE: 12pt; MARGIN: 0in 0in 0pt; FONT-FAMILY: "Times New Roman","serif"
}
A:link {
COLOR: blue; TEXT-DECORATION: underline; mso-style-priority: 99
}
SPAN.MsoHyperlink {
COLOR: blue; TEXT-DECORATION: underline; mso-style-priority: 99
}
A:visited {
COLOR: purple; TEXT-DECORATION: underline; mso-style-priority: 99
}
SPAN.MsoHyperlinkFollowed {
COLOR: purple; TEXT-DECORATION: underline; mso-style-priority: 99
}
P {
FONT-SIZE: 12pt; MARGIN-LEFT: 0in; MARGIN-RIGHT: 0in; FONT-FAMILY: "Times New Roman","serif"; mso-style-priority: 99; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto
}
SPAN.EmailStyle18 {
COLOR: #1f497d; FONT-FAMILY: "Calibri","sans-serif"; mso-style-type: personal
}
SPAN.EmailStyle19 {
COLOR: #1f497d; FONT-FAMILY: "Calibri","sans-serif"; mso-style-type: personal-reply
}
.MsoChpDefault {
FONT-SIZE: 10pt; mso-style-type: export-only
}
DIV.WordSection1 {
page: WordSection1
}
</STYLE>
<!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></HEAD>
<BODY lang=EN-US vLink=purple link=blue>
<DIV dir=ltr align=left><SPAN class=990332315-07122010><FONT face=Arial
color=#0000ff size=2>Hi there --</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=990332315-07122010><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=990332315-07122010><FONT face=Arial
color=#0000ff size=2>The output below shows the top processes on the
server:</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=990332315-07122010><FONT face=Arial
color=#0000ff size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=990332315-07122010><FONT face=Arial
color=#0000ff size=2>439 processes: 438 sleeping, 1 running, 0 zombie, 0 stopped<BR>
CPU0 states: 19.0% user, 9.4% system, 0.0% nice, 71.0% idle<BR>
CPU1 states: 20.1% user, 13.0% system, 0.0% nice, 66.3% idle<BR>
CPU2 states: 27.1% user, 17.3% system, 0.0% nice, 55.0% idle<BR>
Mem: 2064324K av, 2013820K used, 50504K free, 0K shrd, 487764K buff<BR>
Swap: 2096472K av, 12436K used, 2084036K free 976244K cached</FONT></SPAN></DIV>
<DIV> </DIV>
<DIV dir=ltr align=left><SPAN class=990332315-07122010><FONT face=Arial
color=#0000ff size=2>PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND<BR>
2398 root 15 0 1280 1280 824 R 1.9 0.0 0:00 top<BR>
5648 root 22 0 1196 1196 1104 S 1.3 0.0 0:00 ASMProServer<BR>
1 root 15 0 488 484 448 S 0.0 0.0 2:28 init<BR>
2 root 0K 0 0 0 0 SW 0.0 0.0 0:00 migration_CPU0<BR>
3 root 0K 0 0 0 0 SW 0.0 0.0 0:00 migration_CPU1<BR>
4 root 0K 0 0 0 0 SW 0.0 0.0 0:00 migration_CPU2<BR>
5 root 15 0 0 0 0 SW 0.0 0.0 0:03 keventd<BR>
6 root 34 19 0 0 0 SWN 0.0 0.0 17:52 ksoftirqd_CPU0<BR>
7 root 34 19 0 0 0 SWN 0.0 0.0 16:39 ksoftirqd_CPU1<BR>
8 root 34 19 0 0 0 SWN 0.0 0.0 17:33 ksoftirqd_CPU2<BR>
9 root 15 0 0 0 0 SW 0.0 0.0 28:22 kswapd<BR>
10 root 15 0 0 0 0 SW 0.0 0.0 42:39 bdflush<BR>
11 root 15 0 0 0 0 SW 0.0 0.0 3:08 kupdated<BR>
12 root 25 0 0 0 0 SW 0.0 0.0 0:00 mdrecoveryd<BR>
18 root 16 0 0 0 0 SW 0.0 0.0 0:00 scsi_eh_0<BR>
21 root 15 0 0 0 0 SW 0.0 0.0 4:38 kjournald<BR>
101 root 15 0 0 0 0 SW 0.0 0.0 0:00 khubd<BR>
265 root 15 0 0 0 0 SW 0.0 0.0 0:03 kjournald<BR>
266 root 15 0 0 0 0 SW 0.0 0.0 3:43 kjournald<BR>
267 root 15 0 0 0 0 SW 0.0 0.0 0:04 kjournald<BR>
268 root 15 0 0 0 0 SW 0.0 0.0 0:01 kjournald<BR>
269 root 15 0 0 0 0 SW 0.0 0.0 0:11 kjournald<BR>
270 root 15 0 0 0 0 SW 0.0 0.0 4:34 kjournald<BR>
271 root 15 0 0 0 0 SW 0.0 0.0 4:28 kjournald<BR>
272 root 15 0 0 0 0 SW 0.0 0.0 0:08 kjournald<BR>
273 root 15 0 0 0 0 SW 0.0 0.0 0:14 kjournald<BR>
274 root 15 0 0 0 0 SW 0.0 0.0 0:07 kjournald<BR>
275 root 15 0 0 0 0 SW 0.0 0.0 1:14 kjournald<BR>
805 root 15 0 588 576 532 S 0.0 0.0 1:39 syslogd<BR>
810 root 15 0 448 432 432 S 0.0 0.0 0:00 klogd<BR>
830 rpc 15 0 596 572 508 S 0.0 0.0 0:04 portmap<BR>
858 rpcuser 19 0 708 608 608 S 0.0 0.0 0:00 rpc.statd<BR>
970 root 15 0 0 0 0 SW 0.0 0.0 0:21 rpciod<BR>
971 root 15 0 0 0 0 SW 0.0 0.0 0:00 lockd<BR>
999 ntp 15 0 1812 1812 1732 S 0.0 0.0 5:04 ntpd<BR>
1022 root 15 0 772 720 632 S 0.0 0.0 0:00 ypbind<BR>
1024 root 15 0 772 720 632 S 0.0 0.0 1:16 ypbind</FONT></SPAN></DIV>
<DIV><SPAN class=990332315-07122010><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=990332315-07122010><FONT face=Arial color=#0000ff size=2>What
caught my eye was the number of processes along with the number of sleeping
processes.</FONT></SPAN></DIV>
<DIV><SPAN class=990332315-07122010><FONT face=Arial color=#0000ff size=2>I
tried running the kill command on the kjournald instances, but that did not
appear to stop them.</FONT></SPAN></DIV>
<DIV><SPAN class=990332315-07122010><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=990332315-07122010><FONT face=Arial color=#0000ff size=2>Aside
from rebooting the server, which can be done if necessary, what other approach
can I try?</FONT></SPAN></DIV>
<DIV><SPAN class=990332315-07122010><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=990332315-07122010><FONT face=Arial color=#0000ff
size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><BR></DIV><BR>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> Daniel Wittenberg
[mailto:daniel.wittenberg.r0ko@statefarm.com] <BR><B>Sent:</B> Tuesday, December
07, 2010 9:11 AM<BR><B>To:</B> Nagios Users List<BR><B>Subject:</B> Re:
[Nagios-users] Determining what is causing a high load reported by check_load
plugin<BR></FONT><BR></DIV>
<DIV></DIV>
<DIV class=WordSection1>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">So
what are the first few processes listed in top? That should be what is
causing your load then.<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">Dan<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p> </o:p></SPAN></P>
<DIV>
<DIV
style="BORDER-RIGHT: medium none; PADDING-RIGHT: 0in; BORDER-TOP: #b5c4df 1pt solid; PADDING-LEFT: 0in; PADDING-BOTTOM: 0in; BORDER-LEFT: medium none; PADDING-TOP: 3pt; BORDER-BOTTOM: medium none">
<P class=MsoNormal><B><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Tahoma','sans-serif'">From:</SPAN></B><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Tahoma','sans-serif'"> Kaplan, Andrew H.
[mailto:AHKAPLAN@PARTNERS.ORG] <BR><B>Sent:</B> Tuesday, December 07, 2010 7:49
AM<BR><B>To:</B> Nagios Users List<BR><B>Subject:</B> Re: [Nagios-users]
Determining what is causing a high load reported by check_load
plugin<o:p></o:p></SPAN></P></DIV></DIV>
<P class=MsoNormal><o:p> </o:p></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Arial','sans-serif'">Hi there
--</SPAN><o:p></o:p></P>
<P class=MsoNormal> <o:p></o:p></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Arial','sans-serif'">The load
values displayed in top match those reported by the check_load plugin, whether
the plugin is run automatically or interactively. The output of the uptime
command is shown below:</SPAN><o:p></o:p></P>
<P class=MsoNormal> <o:p></o:p></P>
<DIV>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Arial','sans-serif'">8:48am
up 153 days, 23:21, 1 user, load average: 73.36, 73.29,
73.21</SPAN><o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<DIV>
<P class=MsoNormal> <o:p></o:p></P></DIV>
<P class=MsoNormal> <o:p></o:p></P>
<P class=MsoNormal><o:p> </o:p></P>
<DIV class=MsoNormal style="TEXT-ALIGN: center" align=center>
<HR align=center width="100%" SIZE=2>
</DIV>
<P class=MsoNormal style="MARGIN-BOTTOM: 12pt"><B><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Tahoma','sans-serif'">From:</SPAN></B><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Tahoma','sans-serif'"> Daniel Wittenberg
[mailto:daniel.wittenberg.r0ko@statefarm.com] <BR><B>Sent:</B> Monday, December
06, 2010 4:40 PM<BR><B>To:</B> Nagios Users List<BR><B>Subject:</B> Re:
[Nagios-users] Determining what is causing a high load reported by check_load
plugin</SPAN><o:p></o:p></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'">Does
top show the same load values? The status of your memory shouldn’t cause the
Nagios plugin to report a high load. What does the uptime command say? Try
running the check_load plugin by hand on that host and verify that it returns
the same results.<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><BR>Dan<o:p></o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p> </o:p></SPAN></P>
<P class=MsoNormal><SPAN
style="FONT-SIZE: 11pt; COLOR: #1f497d; FONT-FAMILY: 'Calibri','sans-serif'"><o:p> </o:p></SPAN></P>
<DIV
style="BORDER-RIGHT: medium none; PADDING-RIGHT: 0in; BORDER-TOP: #b5c4df 1pt solid; PADDING-LEFT: 0in; PADDING-BOTTOM: 0in; BORDER-LEFT: medium none; PADDING-TOP: 3pt; BORDER-BOTTOM: medium none">
<P class=MsoNormal><B><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Tahoma','sans-serif'">From:</SPAN></B><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: 'Tahoma','sans-serif'"> Marc Powell
[mailto:lists@xodus.org] <BR><B>Sent:</B> Monday, December 06, 2010 3:26
PM<BR><B>To:</B> Nagios Users List<BR><B>Subject:</B> Re: [Nagios-users]
Determining what is causing a high load reported by check_load
plugin<o:p></o:p></SPAN></P></DIV>
<P class=MsoNormal><o:p> </o:p></P>
<P class=MsoNormal style="MARGIN-BOTTOM: 12pt"><o:p> </o:p></P>
<DIV>
<P class=MsoNormal>On Mon, Dec 6, 2010 at 1:50 PM, Kaplan, Andrew H. <<A
href="mailto:AHKAPLAN@partners.org">AHKAPLAN@partners.org</A>>
wrote:<o:p></o:p></P>
<DIV>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">Hi there
--</SPAN> <o:p></o:p></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">We are
running a Nagios 3.1.2 server, and the client that is the subject of this e-mail
is running version 2.6 of the NRPE client.</SPAN><o:p></o:p></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">The
check_load plugin, version 1.4, is reporting the following 1-, 5- and 15-minute
load averages:</SPAN> <o:p></o:p></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">load
average: 71.00, 71.00, 70.95 CRITICAL</SPAN> <o:p></o:p></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">The critical
thresholds of the plugin have been set to 30, 25, 20.</SPAN>
<o:p></o:p></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">When I
checked the client in question, the first thing I did was to run the top
command. The results are shown below:</SPAN> <o:p></o:p></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">CPU0
states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle</SPAN>
<BR><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">CPU1
states: 0.0% user, 0.0% system, 0.0% nice, 100.0% idle</SPAN>
<BR><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">CPU2
states: 1.0% user, 4.0% system, 0.0% nice, 93.0% idle</SPAN>
<BR><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">Mem:
2064324K av, 2032308K used, 32016K
free, 0K shrd, 509924K buff</SPAN>
<BR><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">Swap:
2096472K av, 21432K used, 2075040K
free
1035592K cached</SPAN> <o:p></o:p></P>
<P><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: 'Arial','sans-serif'">The one
thing I noticed was that free memory was down to about thirty-two megabytes.
I wanted to know whether that is what is causing the critical status, or whether
there is something else that I should
investigate.</SPAN><o:p></o:p></P></DIV>
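<P class=MsoNormal>For reference, the threshold comparison can be sketched as
follows. This is only an illustration, not the plugin's source: the function
name load_status is hypothetical, and it assumes the check goes critical when
any one of the three averages is strictly greater than its threshold.</P>
<PRE>
```python
def load_status(loads, critical):
    """Return "CRITICAL" if any load average exceeds its threshold, else "OK".

    Hypothetical helper: the strict greater-than comparison is an assumption.
    """
    # Pair the 1-, 5- and 15-minute averages with their respective thresholds.
    exceeded = any(value > limit for value, limit in zip(loads, critical))
    return "CRITICAL" if exceeded else "OK"


# Values from this thread: readings 71.00, 71.00, 70.95 against 30, 25, 20.
print(load_status((71.00, 71.00, 70.95), (30, 25, 20)))
```
</PRE>
<P class=MsoNormal>With the readings above, every average is well over its
threshold, so the check stays critical regardless of how much memory is free.</P>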
<DIV>
<P class=MsoNormal style="MARGIN-BOTTOM: 12pt"><BR>Memory is not a factor in the
load calculation; only the number of processes running or waiting to run is (on
Linux, processes in uninterruptible sleep, typically blocked on I/O, are counted
as well). For at least 15 minutes you averaged approximately 71 such processes.
Running top/ps was the right thing to do, but you really need to do it while the
problem is occurring to see what is actually responsible. There are far too many
reasons why load could be high, but it should be easy for someone familiar with
your system to figure it out (at least generally) while it is
happening.<BR><BR>--<BR>Marc<o:p></o:p></P></DIV></DIV>
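<P class=MsoNormal>One way to see which tasks are feeding the load figure while
the problem is occurring is to tally process states from /proc. The following is
a minimal sketch, assuming a Linux host with /proc mounted; Linux counts tasks in
state R (running or runnable) and state D (uninterruptible sleep) toward the load
average. The function names are hypothetical, and only processes are counted, not
their individual threads.</P>
<PRE>
```python
import glob
from collections import Counter


def task_state(stat_line):
    """Extract the one-letter state from the text of a /proc/PID/stat file."""
    # The command name sits in parentheses and may itself contain spaces or
    # parentheses, so split on the last ")" and take the field that follows.
    return stat_line.rpartition(")")[2].split()[0]


def count_load_tasks(stat_lines):
    """Count tasks in the states that contribute to the Linux load average."""
    states = Counter(task_state(line) for line in stat_lines)
    # R = running or runnable, D = uninterruptible sleep (usually I/O wait).
    return {"R": states["R"], "D": states["D"]}


def read_stat_lines(pattern="/proc/[0-9]*/stat"):
    """Yield the contents of every readable per-process stat file."""
    for path in glob.glob(pattern):
        try:
            with open(path) as handle:
                yield handle.read()
        except OSError:
            # The process exited between listing and reading; skip it.
            continue


if __name__ == "__main__":
    print(count_load_tasks(read_stat_lines()))
```
</PRE>
<P class=MsoNormal>If the D count is large while the CPUs are mostly idle, the
load is probably coming from tasks blocked on I/O rather than from CPU
demand.</P>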
</DIV></BODY></HTML>