[naemon-dev] Naemon daemon memory utilization
Jason Cook
jasonc at liquidgravity.com
Mon Jun 9 14:17:34 CEST 2014
On Jun 9, 2014, at 4:43 AM, Andreas Ericsson <ageric79 at gmail.com> wrote:
> Heya Jason. Long time no see. How's things?
>
> On 2014-06-05 16:19, Jason Cook wrote:
>>
>> On May 22, 2014, at 2:42 AM, Sven Nierlein <Sven.Nierlein at Consol.de> wrote:
>>
>>> On 13/05/14 15:59, Jason Cook wrote:
>>>>>> On 07/05/14 16:37, Jason Cook wrote:
>>>>>>> Yep, the Naemon core process is definitely the one that grows, and it's 100% reproducible for me. It grows to the max available memory on the box, then gets OOM killed. Doesn't happen when mod_gearman isn't enabled. I've seen reports that it may be happening with Nagios 4 as well, but haven't tested it myself.
>>>>>>>
>>>>>>> Test environment is RHEL 6u4.
>>>>>>>
>>>>>>> The valgrind log wasn't mine, but we seem to have very similar setups.
>>>>>>>
>>>>>> Could you try the latest version of mod-gearman? I fixed some potential memory leaks which may occur in case
>>>>>> of connection errors.
>>>>>>
>>>> Looks like it’s still swelling.. after ~19 hours..
>>>
>>> I found another memory leak. Seems like the way check results are freed has changed, so mod-gearman has to do that by itself now.
>>> Could you try the latest git HEAD of mod-gearman? In my tests, memory usage was constant over the last 12 hours.
>>>
>>> Sven
>>
>> Just to follow up on this, it’s a lot better, though still happening (albeit much, much slower)…
>>
>> nagios 22629 3.6 18.9 2208276 1523904 ? Ssl May30 317:59 /usr/bin/naemon -d /etc/naemon/naemon.cfg
>>
>> After running for nearly a week, it’s at ~1.5GB memory usage… Here it is in a 60 second snapshot..
>>
>> nagios 22629 3.6 18.9 2208276 1524700 ? Ssl May30 318:10 /usr/bin/naemon -d /etc/naemon/naemon.cfg
>> nagios 22629 3.6 18.9 2208276 1524916 ? Ssl May30 318:12 /usr/bin/naemon -d /etc/naemon/naemon.cfg
>>
>> Growing very, very slowly, but still growing.
>>
>
> That looks like a small-ish string or a container for something is
> being leaked continuously. Are you using a lot of on-demand macros,
> or custom object variables?
>
> I'm trying to think of things we may have overlooked when running
> valgrind tests here. Normally, naemon doesn't leak at all, but it
> seems we haven't tested every possible feature in a long-running
> system.
>
> Worst case scenario, memory is lost due to fragmentation, but it eats
> RAM a little bit too fast for it to be that.
>
> /Andreas
No on-demand macros or custom object variables - our configs are really, really straightforward. This example is a small-ish config, about 1300 hosts and 11,000 service objects.
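
To put a rough number on the growth, here is a minimal Python sketch that reads the RSS column from the two `ps` lines in the 60 second snapshot quoted above and extrapolates a daily rate. The function and constant names are made up for illustration, and it assumes the standard `ps aux` column layout (RSS is the sixth field, in KiB) and that the two lines really are 60 seconds apart.

```python
# Sketch only: estimate a leak rate from two `ps aux` samples of one process.
# Assumes the usual `ps aux` layout:
#   USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
# where RSS (resident set size) is reported in KiB.

# The two lines from the 60 second snapshot in this thread.
PS_SAMPLES = [
    "nagios 22629 3.6 18.9 2208276 1524700 ? Ssl May30 318:10 "
    "/usr/bin/naemon -d /etc/naemon/naemon.cfg",
    "nagios 22629 3.6 18.9 2208276 1524916 ? Ssl May30 318:12 "
    "/usr/bin/naemon -d /etc/naemon/naemon.cfg",
]

# Assumption: wall-clock time between the two samples, in seconds.
INTERVAL_SECONDS = 60


def rss_kib(ps_line: str) -> int:
    """Return the RSS column (sixth whitespace-separated field) in KiB."""
    return int(ps_line.split()[5])


def growth_kib_per_day(first: str, last: str, interval_seconds: float) -> float:
    """Linearly extrapolate RSS growth between two samples to KiB per day."""
    delta_kib = rss_kib(last) - rss_kib(first)
    return delta_kib * 86400 / interval_seconds


if __name__ == "__main__":
    rate = growth_kib_per_day(PS_SAMPLES[0], PS_SAMPLES[1], INTERVAL_SECONDS)
    print(f"approx. {rate / 1024:.0f} MiB/day")
```

A single 60 second window is a noisy sample, so this is only a ballpark, but it comes out at roughly 300 MiB per day, which is in the same range as reaching ~1.5GB after nearly a week.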