Number of Notifications!?

Hari Sekhon hpsekhon at googlemail.com
Thu Aug 31 16:40:35 CEST 2006


lennart.kvam at softronic.se wrote:
> No, that did not work :( Now I get mail every 4th minute :) I only want 
> one... anyone?
>  
> Regards
> Lelle
>
> ------------------------------------------------------------------------
> *From:* nagios-users-bounces at lists.sourceforge.net 
> [mailto:nagios-users-bounces at lists.sourceforge.net] *On behalf of* 
> lennart.kvam at softronic.se
> *Sent:* 31 August 2006 13:38
> *To:* hpsekhon at googlemail.com
> *Cc:* nagios-users at lists.sourceforge.net
> *Subject:* Re: [Nagios-users] Number of Notifications!?
>
> Hi!
> Thanks for the fast answer!
> Here are the config files I'm using:
>  
> # Plugin commands (service and host check commands)
> # Arguments are likely to change between different releases of the
> # plugins, so you should use the same config file provided with the
> # plugin release rather than the one provided with Nagios.
> cfg_file=/usr/local/nagios/etc/checkcommands.cfg
>  
> # Misc commands (notification and event handler commands, etc)
> cfg_file=/usr/local/nagios/etc/misccommands.cfg
>  
> # You can split other types of object definitions across several
> # config files if you wish (as done here), or keep them all in a
> # single config file.
>  
> *cfg_file=/usr/local/nagios/etc/minimal.cfg*
> cfg_file=/usr/local/nagios/etc/escalations.cfg
>  
> #cfg_file=/usr/local/nagios/etc/contactgroups.cfg
> #cfg_file=/usr/local/nagios/etc/contacts.cfg
> #cfg_file=/usr/local/nagios/etc/dependencies.cfg
> #cfg_file=/usr/local/nagios/etc/hostgroups.cfg
> #cfg_file=/usr/local/nagios/etc/hosts.cfg
> #cfg_file=/usr/local/nagios/etc/services.cfg
> #cfg_file=/usr/local/nagios/etc/timeperiods.cfg
>  
> # Extended host/service info definitions are now stored along with
> # other object definitions:
> cfg_file=/usr/local/nagios/etc/hostextinfo.cfg
> cfg_file=/usr/local/nagios/etc/serviceextinfo.cfg
>  
> I will try this now... I'll let you know if it works :)
>  
> Thanks
> Lelle
>
> ------------------------------------------------------------------------
> *From:* Hari Sekhon [mailto:hpsekhon at googlemail.com]
> *Sent:* 31 August 2006 13:06
> *To:* Kvam Lennart
> *Cc:* nagios-users at lists.sourceforge.net
> *Subject:* Re: [Nagios-users] Number of Notifications!?
>
> Look in your nagios.cfg file to see which config files it is actually 
> using. I don't think most people use minimal.cfg, but you won't know 
> until you look there.
>
> Then find the service definition and change the line
>
> notification_interval           0
>
> to
>
> notification_interval           60
>
> to have it email you once an hour while a service is down (assuming you 
> haven't changed the default interval_length in nagios.cfg).
>
>
> That should sort it.
>
>
>
> lennart.kvam at softronic.se wrote:
>> Hello all Nagios gurus!
>>  
>> I have an irritating problem: when a service goes down/critical I get 
>> a notification every second minute! That's irritating :)
>>  
>> I've also read the docs on notifications but I still haven't solved it! 
>> Can anyone help me, please? :)
>> Here is some of my config (snipped):
>>  
>> Nagios.cfg:
>> command_check_interval=-1
>> log_notifications=1
>> interval_length=60
>> enable_notifications=1
>> status_update_interval=15
>>  
>>  
>> Minimal.cfg:
>>  
>> define host{
>>         name                            generic-host    ; The name of this host template
>>         notifications_enabled           1       ; Host notifications are enabled
>>         event_handler_enabled           1       ; Host event handler is enabled
>>         flap_detection_enabled          1       ; Flap detection is enabled
>>         failure_prediction_enabled      1       ; Failure prediction is enabled
>>         process_perf_data               1       ; Process performance data
>>         retain_status_information       1       ; Retain status information across program restarts
>>         retain_nonstatus_information    1       ; Retain non-status information across program restarts
>>         register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
>>         }
>>  
>> define service{
>>         name                            generic-service ; The 'name' of this service template
>>         active_checks_enabled           1       ; Active service checks are enabled
>>         passive_checks_enabled          1       ; Passive service checks are enabled/accepted
>>         parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
>>         obsess_over_service             1       ; We should obsess over this service (if necessary)
>>         check_freshness                 1       ; Default is to NOT check service 'freshness'
>>         notifications_enabled           1       ; Service notifications are enabled
>>         event_handler_enabled           1       ; Service event handler is enabled
>>         flap_detection_enabled          1       ; Flap detection is enabled
>>         failure_prediction_enabled      1       ; Failure prediction is enabled
>>         process_perf_data               1       ; Process performance data
>>         retain_status_information       1       ; Retain status information across program restarts
>>         retain_nonstatus_information    1       ; Retain non-status information across program restarts
>>         register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
>>         }
>>  
>>  
>> define host{
>>         use                     generic-host            ; Name of host template to use
>>         host_name               STSSTOSIG
>>         alias                   STSSTOSIG
>>         address                 111.111.111.111
>>         check_command           check-host-alive
>>         max_check_attempts      10
>>         notification_interval   0
>>         notification_period     24x7
>>         notification_options    d,u,r,
>>         notifications_enabled  1
>>         contact_groups      admins
>>         }
>>  
>>
>> define  service {
>>         use                             generic-service         ; check_ping
>>         host_name                       STSSTOSIG
>>         service_description             PING
>>         is_volatile                     1
>>         max_check_attempts              5
>>         normal_check_interval           2
>>         retry_check_interval            1
>>         active_checks_enabled           1
>>         passive_checks_enabled          1
>>         check_period                    24x7
>>         parallelize_check               1
>>         obsess_over_service             1
>>         check_freshness                 1
>>         freshness_threshold             420
>>         event_handler_enabled           1
>>         low_flap_threshold              0
>>         high_flap_threshold             0
>>         flap_detection_enabled          1
>>         process_perf_data               1
>>         retain_status_information       1
>>         retain_nonstatus_information    0  
>>         contact_groups                  oas-hd,admins
>>         notification_interval           0
>>         notification_period             24x7
>>         notification_options            w,u,c,r
>>         notifications_enabled           1
>>         register                        1
>>         check_command                   check_ping!100.0,20%!500.0,60%
>>         }
>>  
>> This is what I have in my minimal.cfg :) Something must be wrong... or?
>>  
>> Thanks
>> Lelle
>>  

As someone suggested, investigate flap detection. I haven't looked into 
this myself, since I have always had this working consistently, unless 
the check times out or the service is rebooted or something.

You may also want to look at the timeout option that most plugins 
provide. Perhaps the check doesn't get a response in time and is flagged 
critical, which may look like flapping and may explain why you get 
emails every few minutes instead of once every hour or two...

You could try running the check manually many times to find out whether 
it occasionally fails for any reason. You can script this in Bash (I 
love that) or in something else you know, such as Perl or Python, and 
save the results to a log file to see what the return values are over 
time. If you timestamp each log entry, you can compare the emails you're 
getting against the log to see whether the check was succeeding at that 
point. Or you could look at the Nagios history first to compare.
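
A rough sketch of that kind of logging script, in Python. Treat it as a starting point rather than a finished tool. The function names, the log format and the choice to record a timeout as UNKNOWN are my own assumptions. The only standard parts are the plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import subprocess
import time
from datetime import datetime

# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_check(command, timeout=30):
    """Run the check once; return (exit_code, first line of plugin output)."""
    try:
        proc = subprocess.run(command, capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        # Assumption: record a timeout of this wrapper as UNKNOWN (3).
        return 3, "check timed out after %s seconds" % timeout
    except OSError as err:
        # The plugin could not be started at all (wrong path, permissions).
        return 3, "could not run check: %s" % err
    lines = proc.stdout.strip().splitlines()
    return proc.returncode, lines[0] if lines else ""


def format_entry(when, code, text):
    """Build one timestamped log line."""
    state = STATES.get(code, "UNKNOWN")
    return f"{when:%Y-%m-%d %H:%M:%S} {state} (exit {code}) {text}"


def log_checks(command, logfile, runs, interval):
    """Run the check `runs` times, `interval` seconds apart, appending to logfile."""
    with open(logfile, "a") as fh:
        for i in range(runs):
            code, text = run_check(command)
            fh.write(format_entry(datetime.now(), code, text) + "\n")
            fh.flush()  # keep the log readable while the loop is running
            if i < runs - 1:
                time.sleep(interval)


# Example call (plugin path is an assumption, adjust to your install):
# log_checks(["/usr/local/nagios/libexec/check_ping", "-H", "111.111.111.111",
#             "-w", "100.0,20%", "-c", "500.0,60%"],
#            "check_ping.log", runs=120, interval=30)
```

Each log line starts with a timestamp, so you can line the entries up against the times of the notification emails.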

If the service really is flapping, then you need to fix that, not Nagios.

Food for thought.


Hari Sekhon