checks, notifications don't work after time period exception

Jon Angliss jon at netdork.net
Thu Aug 28 06:00:19 CEST 2008


On Tue, 26 Aug 2008 11:09:15 -0500, Mark Young <myoung at nagios.org>
wrote:

>> I'll probably do away with the time exclusion as it still isn't  
>> working.
>> I have one service that went critical during the exclusion which has a
>> 24x7 check but 4-5:30 notification exclusion and now I can't make it
>> send a notice.  I disabled notifications for that service and enabled
>> again; also restarted Nagios and still won't send a notice.  It's
>> configured to send on the second attempt and every hour after.
>
>You should be able to use timeperiods the way you would like to.  I  
>will have to set up my test environment to see if I get the same
>problem you are describing.  It would be nice to figure this out.  Is  
>this a problem with other users?

I was going to set up a little test environment to retest it, but
unfortunately "end of the world" projects keep coming up at work.

>On Aug 25, 2008, at 6:28 PM, Jon Angliss wrote:
>> I had a similar issue. I tried excluding a time slot between 0300 and
>> 0600 due to large DB loads causing website performance issues
>> (backups, indexing, and such).  When the start of the exception rolled
>> around, Nagios would just stop checking that service, and would only
>> resume on a forced manual check, or a restart of the Nagios service.  I
>> ended up changing the theory to continue checking 24x7, but only alert
>> when outside the exception.  Not that it helps the issue, but it
>> stopped the symptoms.  I never got around to really digging into the
>> issue any deeper.  It did give me the added benefit of knowing when
>> the server was actually down during the maintenance window, and when
>> it was just being slow.  This was handy for management purposes so
>> they could calculate if they need more head count.

>Hi Jon,
>So it was following your notification timeperiod but not your check  
>timeperiod?  What version of Nagios were you running?  Were you using
>the exclude function (new to Nagios 3.x)?  Nagios 3.x did add new
>functionality to timeperiods.  It is possible that a bug was introduced.

I had defined a timeperiod of "ProdSLA" which was 24x7, then an
exclusion timeperiod called "ProdMaint", which was 0300-0500 daily,
then added that as an excluded time period for ProdSLA.  This was used
for checks, not notifications.  When the start of the excluded time
period was reached, it would stop checking, as designed, but never
resumed checking at 0500.  I'll have to see if I can dig out the
config files from CVS.  It was a 3.x install.
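
For illustration, here is a minimal sketch of the layout described above,
in Nagios 3.x object syntax.  It is not the original config (that still
has to come out of CVS).  The names ProdSLA and ProdMaint are from the
description; the aliases, the host, the service, and the generic-service
template are assumptions made up for the example.

```
# Maintenance window: 0300-0500 every day.
define timeperiod{
        timeperiod_name ProdMaint
        alias           Production maintenance window   ; alias is assumed
        sunday          03:00-05:00
        monday          03:00-05:00
        tuesday         03:00-05:00
        wednesday       03:00-05:00
        thursday        03:00-05:00
        friday          03:00-05:00
        saturday        03:00-05:00
        }

# 24x7, minus the maintenance window.
define timeperiod{
        timeperiod_name ProdSLA
        alias           Production SLA hours            ; alias is assumed
        sunday          00:00-24:00
        monday          00:00-24:00
        tuesday         00:00-24:00
        wednesday       00:00-24:00
        thursday        00:00-24:00
        friday          00:00-24:00
        saturday        00:00-24:00
        exclude         ProdMaint
        }

# Hypothetical service using the excluded period for checks, which is
# the arrangement where checks stopped at 0300 and did not resume.
define service{
        use                     generic-service   ; assumed template
        host_name               prod-web          ; hypothetical host
        service_description     HTTP
        check_command           check_http
        check_period            ProdSLA
        }
```

The workaround mentioned in the quoted message amounts to swapping the
periods on the service: check_period set to a plain 24x7 timeperiod
(assuming one is defined, as in the sample configs) and
notification_period set to ProdSLA, so checks keep running but alerts
are held during the window.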

-- 
Jon Angliss

