Nagios Only Sending out Flapping Alerts

Goutos, Kevin kgoutos at libertymgt.com
Tue Oct 20 16:42:06 CEST 2009


Hello all,

 

First off, I really appreciate any feedback you can provide.  I've just
recently started working with Nagios and everything is working great,
except for notifications being sent out.  I've been searching all over
the net, comparing my configurations with others to see if there was
anything noticeable I was missing, but I can't seem to find anything.
I've included my configuration files below for you to look at.

 

I have been able to send mail from the server (CentOS 5.3) using
sendmail.  I can also successfully send e-mail using the e-mail commands
from the commands.cfg file:

[code]/usr/bin/mail "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$ <MY EMAIL ADDRESS>[/code]

 

Everything is being monitored correctly; for some reason the e-mails
are just not being sent out.  I should also note that if I use "Send
custom host notification" in the Nagios web interface on a host that has
a critical alert, I DO get an e-mail successfully.  It's only the
automatic notifications that are not being sent.

 

UPDATE: I do have some good news.  I did receive a notification last
night for a flapping alert.  However, Nagios is still not sending alerts
when something goes down, returns to an up state, and so on.

[quote]

[1255678430] SERVICE FLAPPING ALERT: AUSTIN-LAPTOP;CPU Load;STOPPED; Service appears to have stopped flapping (4.0% change < 5.0% threshold)

[1255678430] SERVICE NOTIFICATION: libertyadmins;AUSTIN-LAPTOP;CPU Load;FLAPPINGSTOP (CRITICAL);notify-service-by-email;No route to host[/quote]
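
For anyone reading the log excerpt above, here is a minimal Python sketch that splits the SERVICE NOTIFICATION entry into named fields and converts the epoch timestamp. The function name `parse_notification` is made up for illustration, and the field order (contact, host, service, notification type, command, plugin output) is an assumption based on how Nagios usually writes these entries. If that assumption holds, "No route to host" is the output text of the service check, not an error from the mail command.

```python
from datetime import datetime, timezone


def parse_notification(line):
    """Split one Nagios 'SERVICE NOTIFICATION' log line into named fields.

    Hypothetical helper: the field order is assumed, not taken from the post.
    """
    stamp, rest = line.split("] ", 1)
    when = datetime.fromtimestamp(int(stamp.lstrip("[")), tz=timezone.utc)
    kind, payload = rest.split(": ", 1)
    # maxsplit=5 keeps any semicolons inside the plugin output intact
    contact, host, service, ntype, command, output = payload.split(";", 5)
    return {
        "time": when,
        "kind": kind,
        "contact": contact,
        "host": host,
        "service": service,
        "type": ntype,
        "command": command,
        "output": output,
    }


# The second log entry from the excerpt, with the mail line wrapping undone.
line = ("[1255678430] SERVICE NOTIFICATION: libertyadmins;AUSTIN-LAPTOP;"
        "CPU Load;FLAPPINGSTOP (CRITICAL);notify-service-by-email;"
        "No route to host")

entry = parse_notification(line)
for key, value in entry.items():
    print(f"{key:8} {value}")
```

Read this way, the entry says the contact "libertyadmins" was notified through notify-service-by-email about a FLAPPINGSTOP event, which matches the one e-mail that did arrive.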

 

So it appears I'm receiving flapping alerts, but nothing else.  In the
Nagios web interface, if I go to Configuration --> Contacts, I can see
that under Service Notification Options and Host Notification Options
only "Flapping, Downtime" is listed as enabled.  I'm not getting downtime
alerts, but shouldn't there be more options there (up, down, etc.)?
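
As a cross-check on the option letters, the sketch below compares the string `w,u,c,r,f,s` (the value used for both `service_notification_options` and `host_notification_options` in the contact template further down) against the letters documented for Nagios 3: `d,u,r,f,s,n` for hosts and `w,u,c,r,f,s,n` for services. The tables and the helper name `classify` are illustrative assumptions. The sketch only reports which letters fall outside each documented set and makes no claim about how Nagios itself handles them.

```python
# Letters documented for Nagios 3 notification options. Assumption: the
# installed version follows the standard object definition documentation.
HOST_OPTIONS = {
    "d": "down", "u": "unreachable", "r": "recovery",
    "f": "flapping", "s": "scheduled downtime", "n": "none",
}
SERVICE_OPTIONS = {
    "w": "warning", "u": "unknown", "c": "critical", "r": "recovery",
    "f": "flapping", "s": "scheduled downtime", "n": "none",
}


def classify(option_string, table):
    """Return (known, unknown) option letters for a comma-separated string."""
    letters = [part.strip() for part in option_string.split(",") if part.strip()]
    known = [letter for letter in letters if letter in table]
    unknown = [letter for letter in letters if letter not in table]
    return known, unknown


template_value = "w,u,c,r,f,s"  # as written in the generic-contact template

host_known, host_unknown = classify(template_value, HOST_OPTIONS)
svc_known, svc_unknown = classify(template_value, SERVICE_OPTIONS)

print("host letters recognised:   ", [HOST_OPTIONS[l] for l in host_known])
print("host letters not in docs:  ", host_unknown)
print("service letters not in docs:", svc_unknown)
```

Under these assumptions, `w` and `c` are service-only letters, and the host set has no `d` (down) in the template value. The contact definition that actually uses the template is not included in this post, so any overrides it contains are not covered here.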

 

 

 

Is there something obvious I am missing in the config files?  I really
appreciate the help; please let me know what you think.

 

Templates.cfg

 

 

###############################################################################
###############################################################################
#
# CONTACT TEMPLATES
#
###############################################################################
###############################################################################

define contact{
       name                            generic-contact         ; The name of this contact template
       host_notifications_enabled      1
       service_notifications_enabled   1
       host_notification_commands      notify-host-by-email
       service_notification_commands   notify-service-by-email
       service_notification_period     24x7                    ; service notifications can be sent anytime
       host_notification_period        24x7                    ; host notifications can be sent anytime
       service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
       host_notification_options       w,u,c,r,f,s             ; send notifications for all host states, flapping events, and scheduled downtime events
       register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
       }

 

 

###############################################################################
###############################################################################
#
# HOST TEMPLATES
#
###############################################################################
###############################################################################

# Generic host definition template - This is NOT a real host, just a template!

define host{
       name                               generic-host         ; The name of this host template
       notifications_enabled              1                    ; Host notifications are enabled
       event_handler_enabled              1                    ; Host event handler is enabled
       flap_detection_enabled             1                    ; Flap detection is enabled
       failure_prediction_enabled         1                    ; Failure prediction is enabled
       process_perf_data                  1                    ; Process performance data
       retain_status_information          1                    ; Retain status information across program restarts
       retain_nonstatus_information       1                    ; Retain non-status information across program restarts
       notification_period                24x7                 ; Send host notifications at any time
       check_period                       24x7                 ; By default, Linux hosts are checked round the clock
       check_interval                     1                    ; Actively check the host every 1 minutes
       retry_interval                     1                    ; Schedule host check retries at 1 minute intervals
       max_check_attempts                 10                   ; Check each Linux host 10 times (max)
       check_command                      check-host-alive     ; Default command to check Linux hosts
       notification_period                workhours            ; Linux admins hate to be woken up, so we only notify during the day
       notification_interval              120                  ; Resend notifications every 2 hours
       notification_options               w,u,c,r,f,s          ; Only send notifications for specific host states
       contact_groups                     libertyadminsgroup   ; Notifications get sent to the admins by default
       register                           0                    ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
       }

 

 

 

# Windows host definition template - This is NOT a real host, just a template!

define host{
       name                        windows-server       ; The name of this host template
       use                         generic-host         ; Inherit default values from the generic-host template
       check_period                24x7                 ; By default, Windows servers are monitored round the clock
       check_interval              1                    ; Actively check the server every minute
       retry_interval              1                    ; Schedule host check retries at 1 minute intervals
       max_check_attempts          10                   ; Check each server 10 times (max)
       check_command               check-host-alive     ; Default command to check if servers are "alive"
       notification_period         24x7                 ; Send notification out at any time - day or night
       notification_interval       60                   ; Resend notifications every 60 minutes
       notification_options        w,u,c,r,f,s          ; Only send notifications for specific host states
       contact_groups              libertyadminsgroup   ; Notifications get sent to the admins by default
       hostgroups                  windows-servers      ; Host groups that Windows servers should be a member of
       register                    0                    ; DONT REGISTER THIS - ITS JUST A TEMPLATE
       }

 

 

# Define a template for switches that we can reuse

define host{
       name                        switches                    ; The name of this host template
       use                         generic-host                ; Inherit default values from the generic-host template
       check_period                24x7                        ; By default, switches are monitored round the clock
       check_interval              5                           ; Switches are checked every 5 minutes
       retry_interval              1                           ; Schedule host check retries at 1 minute intervals
       max_check_attempts          10                          ; Check each switch 10 times (max)
       check_command               check-host-alive            ; Default command to check if switches are "alive"
       notification_period         24x7                        ; Send notifications at any time
       notification_interval       30                          ; Resend notifications every 30 minutes
       notification_options        w,u,c,r,f,s                 ; Only send notifications for specific host states
       contact_groups              libertyadminsgroup          ; Notifications get sent to the admins by default
       hostgroups                  switches
       register                    0                           ; DONT REGISTER THIS - ITS JUST A TEMPLATE
       }

 

define host{
       name                        routers                     ; The name of this host template
       use                         generic-host                ; Inherit default values from the generic-host template
       check_period                24x7                        ; By default, routers are monitored round the clock
       check_interval              5                           ; Routers are checked every 5 minutes
       retry_interval              1                           ; Schedule host check retries at 1 minute intervals
       max_check_attempts          10                          ; Check each router 10 times (max)
       check_command               check-host-alive            ; Default command to check if routers are "alive"
       notification_period         24x7                        ; Send notifications at any time
       notification_interval       30                          ; Resend notifications every 30 minutes
       notification_options        w,u,c,r,f,s                 ; Only send notifications for specific host states
       contact_groups              libertyadminsgroup          ; Notifications get sent to the admins by default
       hostgroups                  routers
       register                    0                           ; DONT REGISTER THIS - ITS JUST A TEMPLATE
       }

 

 

 

###############################################################################
###############################################################################
#
# SERVICE TEMPLATES
#
###############################################################################
###############################################################################

# Generic service definition template - This is NOT a real service, just a template!

define service{
        name                            generic-service        ; The 'name' of this service template
        active_checks_enabled           1                      ; Active service checks are enabled
        passive_checks_enabled          1                      ; Passive service checks are enabled/accepted
        parallelize_check               1                      ; Active service checks should be parallelized (disabling this can lead to major performance problems)
        obsess_over_service             1                      ; We should obsess over this service (if necessary)
        check_freshness                 0                      ; Default is to NOT check service 'freshness'
        notifications_enabled           1                      ; Service notifications are enabled
        event_handler_enabled           1                      ; Service event handler is enabled
        flap_detection_enabled          1                      ; Flap detection is enabled
        failure_prediction_enabled      1                      ; Failure prediction is enabled
        process_perf_data               1                      ; Process performance data
        retain_status_information       1                      ; Retain status information across program restarts
        retain_nonstatus_information    1                      ; Retain non-status information across program restarts
        is_volatile                     0                      ; The service is not volatile
        check_period                    24x7                   ; The service can be checked at any time of the day
        max_check_attempts              3                      ; Re-check the service up to 3 times in order to determine its final (hard) state
        normal_check_interval           3                      ; Check the service every 3 minutes under normal conditions
        retry_check_interval            1                      ; Re-check the service every minute until a hard state can be determined
        contact_groups                  libertyadminsgroup     ; Notifications get sent out to everyone in the 'libertyadminsgroup' group
        notification_options            w,u,c,r,f,s            ; Send notifications about warning, unknown, critical, recovery, flapping, and scheduled downtime events
        notification_interval           60                     ; Re-notify about service problems every hour
        notification_period             24x7                   ; Notifications can be sent out at any time
        register                        0                      ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
        }

 

 

Commands.cfg

 

 

# 'notify-host-by-email' command definition

define command{
       command_name  notify-host-by-email
       command_line  /usr/bin/mail "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState: $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time: $LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert: $HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
       }

# 'notify-service-by-email' command definition

define command{
       command_name  notify-service-by-email
       command_line  /usr/bin/mail "%b" "***** Nagios *****\n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$\nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n\nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$" | /bin/mail -s "** $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
       }

 

 

Services.cfg

 

define service{
       use                  generic-service
       hostgroup_name       windows-servers
       service_description  System Uptime
       check_command        check_nt!UPTIME
       }

define service{
       use                  generic-service
       hostgroup_name       windows-servers
       service_description  CPU Load
       check_command        check_nt!CPULOAD!-l 5,80,90
       }

define service{
       use                  generic-service
       hostgroup_name       windows-servers
       service_description  Memory Usage
       check_command        check_nt!MEMUSE!-w 80 -c 90
       }

define service{
       use                  generic-service
       hostgroup_name       windows-servers
       service_description  Used Disk Space
       check_command        check_nt!USEDDISKSPACE!-l c -w 80 -c 90
       }

define service{
       use                         generic-service
       hostgroup_name              windows-servers
       service_description         Ping Test
       check_period                24x7
       max_check_attempts          3
       normal_check_interval       3
       retry_check_interval        1
       contact_groups              libertyadminsgroup
       notification_interval       60
       notification_period         24x7
       notification_options        w,u,c,r
       check_command               check_ping!200.0,20%!600.0,60%     ; The command used to monitor the service
       }

 

 

Hosts.cfg

 

define host{
       use           windows-server       ; Inherit default values from a template
       host_name     NAME                 ; The name we're giving to this host
       alias         NAME                 ; A longer name associated with the host
       address       <IP ADDRESS>         ; IP address of the host
       }

 
