first_notification_delay for hosts

Ethan Galstad nagios at nagios.org
Thu Dec 1 17:46:25 CET 2005


On 1 Dec 2005 at 12:33, Mathias Sundman wrote:

> On Thu, 24 Nov 2005, Andreas Ericsson wrote:
> 
> > This patch adds a variable to the host object configuration, 
> > first_notification_delay, which causes notifications for a host to be put off 
> > until a minimum amount of time has passed.
> >
> > This is intended to artificially mimic the service notification logic that 
> > allows some time to pass between a detected error and the first notification 
> > by forcing at least some "sleep-time" between the HARD detection of a downed 
> > host and the first notification sent for it.
> >
> > Because of how notifications are scheduled, no host notification is sent 
> > until the host has first been checked max_check_attempts times (run 
> > serially) and a service (or the host itself) has then been checked again. 
> > If the host is still down at that point, the notification is sent, 
> > provided (first_notification_delay * interval_length) seconds have 
> > passed.
> >
> > I did the documentation update. All credits for the code should go to Mathias 
> > Sundman, a Sungard employee and also a customer of ours who sent the patch to 
> > me for review. I'm forwarding it to the list with his explicit consent. I've 
> > tested it and found it to be in good working order.
> 
> Ethan, do you think this patch has any chance of making it into Nagios 
> 2.0?
> 
> Just some background on why I wrote this patch: many of the hosts we 
> monitor are such that we can accept them losing network connectivity for 
> some time (say 10-30 minutes), but if they go down permanently we want to 
> be notified of this.
> 
> To achieve this we had to set up a dummy notification group for the 
> host, and then use escalations to be notified after a number of 
> notification_intervals have elapsed. That solution had a number of 
> drawbacks and felt more like a workaround than a real solution.
> 
> Then I searched the list archive and found a number of other people with 
> the same problem as me but no other solution than the escalation method.
> 
> So, I decided to put this patch together, which works very well for us in 
> our production environment at least...
> 
> Cheers // Mathias
> 

Since this involves adding new functionality, I won't be adding it to 
2.0.  However, this is a great idea and patch, so I will be 
committing it to CVS once I branch the 2.x code and start working 
on 3.0.  I'll also be adding a similar option to delay the initial 
notification time for services.
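
For readers following the thread, the timing rule described in the quoted
patch summary can be sketched roughly as below. This is only an illustration,
not the patch's actual code: the function name and arguments are hypothetical,
the delay is assumed to be measured from the moment the host enters a HARD
down state, and the use of ">=" rather than ">" is a guess. The 60-second
value is the usual Nagios default for interval_length.

```python
def first_notification_due(hard_down_at, now, first_notification_delay,
                           interval_length=60):
    """Return True once the delay window after the HARD down state has passed.

    Hypothetical helper: all times are plain seconds (e.g. Unix timestamps).
    """
    # The delay is expressed in units of interval_length seconds.
    delay_seconds = first_notification_delay * interval_length
    return (now - hard_down_at) >= delay_seconds


# A delay of 15 units at 60 seconds per unit holds notifications for 900 s.
hard_down_at = 1_000_000
print(first_notification_due(hard_down_at, hard_down_at + 600, 15))
print(first_notification_due(hard_down_at, hard_down_at + 900, 15))
```

With these assumptions, a host that recovers within the window would never
trigger a notification, which matches the 10-30 minute use case Mathias
describes.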






Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org







More information about the Developers mailing list