Nagios is ignoring the retry_interval setting
    FTL Nagios 
    ftlnagios at gmail.com
       
    Fri Dec  7 12:15:45 CET 2012
    
    
  
Re-tested after changing the max file size of the debug file.
This one should contain everything from the moment I started Nagios to the
moment I stopped it during testing (approx. 10 minutes):
http://dl.dropbox.com/u/895609/nagios.debug
Thank you
-----Original Message-----
From: FTL Nagios [mailto:ftlnagios at gmail.com] 
Sent: 07 December 2012 10:56
To: 'zarrelli at linux.it'; 'Nagios Users List'
Subject: RE: [Nagios-users] Nagios is ignoring the retry_interval setting
Hi,
Apologies for the delay, I have been very busy with other things.
I put Nagios into debug mode this morning and reran the tests.
I let it get a couple of successful pings to the server, then pulled the
network cable from it.
The behaviour is completely different this morning!
The host check is behaving now and rechecking every 3 minutes, as it is told
to in the host template. I got my text and email alert saying the host was
down when I expected it.
But now it is the service check that is running every 1 minute, which it is
not told to do when in a problem state.
My service template clearly states a retry_interval of 3 minutes when in a
problem state:
define service{
    name                         service-server  ; The name of this service template (used above in the checks)
    check_period                 server_24x7     ; Servers are monitored at all times
    check_interval               1               ; Servers are checked every 1 minute when in OK state
    retry_interval               3               ; Servers are checked every 3 minutes if in problem state
    max_check_attempts           3               ; Servers are checked 3 times to determine if they are in an Up or Down state
    notification_period          server_24x7     ; Emails and texts are sent out at any time of day
    notification_interval        3               ; Resend notifications every 3 minutes
    notification_options         c,r             ; Only send alerts for servers in CRITICAL or RECOVERY state
    notifications_enabled        0               ; Notifications are disabled
    contact_groups               servers email, servers sms  ; Alerts are sent to contacts in these groups
    event_handler_enabled        1               ; Host event handler is enabled
    process_perf_data            1               ; Performance data is processed
    retain_status_information    1               ; Status info is kept between server restarts
    retain_nonstatus_information 1               ; Non-status information is kept between server restarts
    passive_checks_enabled       0               ; Passive checks are disabled
    obsess_over_service          0               ; We do not obsess over the server if in problem state
    check_freshness              0               ; We do not check this server for freshness
    flap_detection_enabled       0               ; Flap detection is disabled
    failure_prediction_enabled   0               ; We will wait for it to actually fail, thank you!!
    }
And even though it is checking every minute, it went straight to a hard state
on the first check that detected it down, and has stayed on check 1/3 hard
state throughout.
I really don't understand what is happening here.
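For comparison, the schedule expected from the service template above can be
sketched in a few lines of Python. This is only a simplified model of the
soft/hard retry logic, not Nagios's actual scheduler. The function name
expected_checks is hypothetical, the start time is arbitrary, and the sketch
assumes that intervals are in minutes (the default interval_length of 60
seconds) and that the host itself is still considered UP. It does not model
any interaction between host and service checks.

```python
from datetime import datetime, timedelta


def expected_checks(first_failure, retry_interval, max_check_attempts):
    """Return (time, attempt, state_type) for each check after a failure.

    Simplified model (an assumption, not Nagios source code): the check that
    first fails is attempt 1, later attempts are spaced retry_interval minutes
    apart, and the state becomes HARD once attempt == max_check_attempts.
    """
    schedule = []
    for attempt in range(1, max_check_attempts + 1):
        when = first_failure + timedelta(minutes=retry_interval * (attempt - 1))
        state_type = "HARD" if attempt == max_check_attempts else "SOFT"
        schedule.append((when, attempt, state_type))
    return schedule


# Values from the template: retry_interval 3, max_check_attempts 3.
# The first failure time (13:00) is an arbitrary example.
for when, attempt, state_type in expected_checks(datetime(2012, 12, 7, 13, 0), 3, 3):
    print(f"{when:%H:%M}  attempt {attempt}/3  {state_type}")
```

Under these assumptions the checks fall at 13:00, 13:03 and 13:06, with only
the third attempt reaching a hard state. That is the behaviour I expected,
rather than a hard state on check 1/3.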
The only difference between this setup and my old Nagios box is the
version: the old box was 3.31 and this new server is 3.4.1. I am using the
same config files that worked fine before.
Here are the debug log files from the above testing:
http://dl.dropbox.com/u/895609/nagios.debug1
http://dl.dropbox.com/u/895609/nagios.debug2
If you see anything please let me know, I'm getting angry with all the
alerts!!! :-)
Thank you
-----Original Message-----
From: Giorgio Zarrelli [mailto:zarrelli at linux.it]
Sent: 29 November 2012 19:24
To: Nagios Users List
Subject: Re: [Nagios-users] Nagios is ignoring the retry_interval setting
Hi,
I do not see anything wrong. Could you set debug_level=-1,
reproduce the problem and put the log online?
Giorgio
<quote who="Andrew Thompson">
> Hi Giorgio,
>
> The whole test cfg I am using to try to troubleshoot this can be found at:
>
> http://dl.dropbox.com/u/895609/test.cfg
>
> This is a direct copy of my main server's config, but with the rest of
> the servers and some templates for other server checks taken out.
>
>
>
> Kind Regards
> Andrew
>
> From: Andrew Thompson
> Sent: 29 November 2012 16:11
> To: nagios-users at lists.sourceforge.net
> Subject: Nagios is ignoring the retry_interval setting
>
> Hi,
>
> My Nagios box has decided to stop listening to the retry_interval
> entry in my templates.
>
> My server template reads:
>
> define host{
>      name                         host-server
>      check_period                 server_24x7
>      check_interval               1
>      retry_interval               3
>      max_check_attempts           3
>      notification_period          server_24x7
>      notification_interval        3
>      notification_options         d,r
>      notifications_enabled        1
>      contact_groups               servers email, servers sms
>      event_handler_enabled        1
>      process_perf_data            1
>      retain_status_information    1
>      retain_nonstatus_information 1
>      passive_checks_enabled       0
>      obsess_over_host             0
>      check_freshness              0
>      flap_detection_enabled       0
>      failure_prediction_enabled   0
>      }
>
> Now this is what happens:
>
> * Server goes down at 1pm.
> * I check the next scheduled check and it clearly states 1.03pm.
> * But at 1.01pm it checks again and then spits out an email and
>   text message saying the server is down.
>
> Completely ignoring the retry_interval setting!!!
>
> I'd expect from the above:
>
> * 1pm server goes down.
> * 1.03pm check 2 is done.
> * 1.06pm check 3 is done and a hard state is determined.
> * At 1.06pm the notification should be sent out.
>
> Why is this? Is something in my config wrong?
>
> Ubuntu 12.04 desktop and Nagios 3.4.1
>
> Thanks
>
>
> ------------------------------------------------------------------------------
> Keep yourself connected to Go Parallel:
> VERIFY Test and improve your parallel project with help from experts
> and peers. http://goparallel.sourceforge.net
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when 
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null