<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix"><i>When it comes to your last finding, I
have no explanation. Just so I understand: you are comparing -24H
with -10080M (-168H). Would it not be better to compare -24H with
-1440M? I have to get back to you on this, but I would need the
result from a run in cacheCli, since it shows the time the query
takes:
<a class="moz-txt-link-freetext" href="http://www.bischeck.org/wp-content/uploads/2014/06/Bischeck_installation_and_administration_guide.html#toc-Section-4.4">http://www.bischeck.org/wp-content/uploads/2014/06/Bischeck_installation_and_administration_guide.html#toc-Section-4.4</a>.</i><br>
<br>
That was a typo; I meant -168H and -10080M. I used
"bischeck cli.CacheCli" to check this. I re-ran it just now and do
not see much difference between the two (retrieving the value takes
about 4-6 seconds).<br>
<br>
Regarding the other points, I will get back to you. On a side note,
I have upgraded redis-server from 2.6 to 2.8, just to rule out any
version-specific performance issues.<br>
<br>
Thanks,<br>
Rahul.<br>
<br>
<br>
On Thursday 18 September 2014 12:19 AM, Anders Håål wrote:<br>
</div>
<blockquote cite="mid:5419D7A2.3070902@ingby.com" type="cite">Hi
Rahul,
<br>
Looking at your threshold, you will retrieve at most 6 values,
which should not be that "hard" even if it's a time-based query.
Using an index is faster, and is something we will look into in
the future.
<br>
Since you run the query every 120 sec, you currently have at least
5040 items in the cache for each service, which does not sound too
bad. With 10 services that is at least 50000 in total.
<br>
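The 5040 figure corresponds to one week (168 hours) of samples at one
value every 120 seconds. A minimal Python sketch of that arithmetic
follows; the helper name samples_in_window is made up for illustration
and is not part of bischeck. The 672 hour window is the oldest offset
used in the threshold expression.
<br>
<pre>
```python
# Rough estimate of cached samples per service item (illustrative only).
SAMPLE_INTERVAL_S = 120  # one value every 120 seconds, as stated in the thread


def samples_in_window(hours, interval_s=SAMPLE_INTERVAL_S):
    """Number of samples collected over `hours` at a fixed interval."""
    return hours * 3600 // interval_s


one_week = samples_in_window(168)    # 5040
four_weeks = samples_in_window(672)  # 20160
print(one_week, four_weeks)
```
</pre>
<br>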
What I would like you to check is the following:
<br>
- If you connect a JMX client to bischeck you can see all the
different timers
<a class="moz-txt-link-freetext" href="http://www.bischeck.org/wp-content/uploads/2014/06/Bischeck_installation_and_administration_guide.html#toc-Chapter-5">http://www.bischeck.org/wp-content/uploads/2014/06/Bischeck_installation_and_administration_guide.html#toc-Chapter-5</a>.
The ones related to thresholds are interesting to start with, but
check all the timers to see whether any of them has a long
execution time.
<br>
- Since it is redis-server that is consuming a high level of CPU,
it would be interesting to see the redis configuration, such as the
amount of memory allocated. If redis needs to swap, that is not good.
<br>
- Please check the redis log files.
<br>
- You can also connect to redis with redis-cli and run the
"monitor" command to get a real-time listing of the commands
executed against redis.
<br>
- Also check the %wa value in top (time spent waiting for I/O). How
much memory do you have on the server? Is it only running bischeck
and redis?
<br>
- How much CPU is bischeck consuming? Do you see any peaks?
<br>
- Also check the bischeck log for any ERROR or WARN entries.
<br>
- And finally - has this been the behavior from the beginning or
has it increased over time? What happens if you restart bischeck
(not reload)?
<br>
<br>
Try to collect some more info so we can try to determine where the
issue lies.
<br>
<br>
When it comes to your last finding, I have no explanation. Just so
I understand: you are comparing -24H with -10080M (-168H). Would it
not be better to compare -24H with -1440M? I have to get back to
you on this, but I would need the result from a run in cacheCli,
since it shows the time the query takes:
<a class="moz-txt-link-freetext" href="http://www.bischeck.org/wp-content/uploads/2014/06/Bischeck_installation_and_administration_guide.html#toc-Section-4.4">http://www.bischeck.org/wp-content/uploads/2014/06/Bischeck_installation_and_administration_guide.html#toc-Section-4.4</a>.<br>
<br>
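To make the comparison concrete, here is a small Python sketch that
converts the offsets mentioned above to minutes. It is only an
illustration of the unit arithmetic (H for hours, M for minutes); the
function offset_minutes is hypothetical and is not how bischeck parses
cache indexes.
<br>
<pre>
```python
# Convert offsets such as "-168H" or "-10080M" to minutes (illustrative).
UNIT_MINUTES = {"M": 1, "H": 60}  # only the two units used in this thread


def offset_minutes(offset):
    """Return the offset in minutes, e.g. "-24H" gives -1440."""
    value, unit = int(offset[:-1]), offset[-1].upper()
    return value * UNIT_MINUTES[unit]


print(offset_minutes("-24H") == offset_minutes("-1440M"))    # True
print(offset_minutes("-168H") == offset_minutes("-10080M"))  # True
print(offset_minutes("-24H") == offset_minutes("-10080M"))   # False
```
</pre>
<br>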
<br>
Regards
<br>
Anders
<br>
<br>
<br>
<br>
<br>
<br>
On 09/17/2014 07:13 PM, Rahul Amaram wrote:
<br>
<blockquote type="cite">Hi,
<br>
I am observing very high CPU consumption by the java process and
redis-server. redis-server, being single-threaded, is by itself
taking 100% CPU. I have about 10 hosts, with about 10 services
each (with one service item per service). Values are generated at
an interval of 120s. The threshold that I have defined
is:
<br>
<br>
avg($$HOSTNAME$$-$$SERVICENAME$$-$$SERVICEITEMNAME$$[-24H],$$HOSTNAME$$-$$SERVICENAME$$-$$SERVICEITEMNAME$$[-96H],$$HOSTNAME$$-$$SERVICENAME$$-$$SERVICEITEMNAME$$[-168H],$$HOSTNAME$$-$$SERVICENAME$$-$$SERVICEITEMNAME$$[-336H],$$HOSTNAME$$-$$SERVICENAME$$-$$SERVICEITEMNAME$$[-504H],$$HOSTNAME$$-$$SERVICENAME$$-$$SERVICEITEMNAME$$[-672H])
<br>
<br>
However, currently no more than 3 values are available.
<br>
<br>
I am already running this on a c3.xlarge machine (4 cores) and
the load average is quite often above 4, which delays the
generation of values. Any pointers to what could be causing the
high load would be much appreciated.
<br>
<br>
On a slightly different note, while using cli.CacheCli,
retrieving the value of a service item one week back using hours
(-24H) is considerably faster than retrieving it using minutes
(-10080M). Again, why does bischeck behave this way?
<br>
<br>
Thanks,
<br>
Rahul.
<br>
<br>
</blockquote>
<br>
<br>
</blockquote>
<br>
</body>
</html>