No host, or service notifications received from Nagios 2.9 for critical states, Resolved

Mark Nassy marknassy at gmail.com
Wed Aug 22 15:50:15 CEST 2007


On Aug 22, 2007, at 6:54 AM, Mark Nassy wrote:

>
>> -----Original Message-----
>> From: nagios-users-bounces at lists.sourceforge.net
>> [mailto:nagios-users-bounces at lists.sourceforge.net] On Behalf Of Mark Nassy
>> Sent: Wednesday, August 22, 2007 9:54 AM
>> To: nagios-users at lists.sourceforge.net
>> Subject: [Nagios-users] No host, or service notifications received
>> from Nagios 2.9 for critical states
>>
>> no notifications are being received from nagios when a service is
>> down. i powered off server03 and did not receive a notification from
>> nagios. the log shows no record of an attempt to send a notification.
>> i can manually send notifications successfully. any ideas why?
>>
>> here is the log with no record of a notification attempt.
>>
>> $ cat /opt/local/var/nagios/nagios.log
>> [1187737502] HOST ALERT: server03;DOWN;SOFT;9;CRITICAL - Plugin timed
>> out after 10 seconds
>> [1187737512] HOST ALERT: server03;DOWN;HARD;10;CRITICAL - Plugin
>> timed out after 10 seconds
>> [1187737512] SERVICE ALERT: server03;PING;CRITICAL;HARD;1;CRITICAL -
>> 192.168.10.127: rta nan, lost 100%
>> [1187738002] HOST ALERT: server03;UP;HARD;1;PING OK - Packet loss =
>> 0%, RTA = 0.45 ms
>> [1187738002] SERVICE ALERT: server03;PING;OK;HARD;1;OK -
>> 192.168.10.127: rta 0.589ms, lost 0%
>> [1187739982] Auto-save of retention data completed successfully.
>> ...
>> [1187741712] HOST ALERT: server03;DOWN;HARD;10;CRITICAL - Plugin timed
>> out after 10 seconds
>> [1187741712] SERVICE ALERT: server03;PING;CRITICAL;HARD;1;CRITICAL -
>> 192.168.10.127: rta nan, lost 100%
>> [1187743582] Auto-save of retention data completed successfully.
>> [1187745607] Caught SIGEXIT, shutting down...
>> [1187745607] Successfully shutdown... (PID=3492)
>> [1187745618] Nagios 2.9 starting... (PID=5949)
>> [1187745618] LOG VERSION: 2.0
>> [1187745618] Finished daemonizing... (New PID=5950)
>> [1187749218] Auto-save of retention data completed successfully.
>>
>>
>>
>> file system permissions review looks ok (to me).
>> $ ls -l
>> ...
>> -r-sr-xr-x   2 root  admin   46644 Aug 20 11:46 check_icmp
>> ...
>> -rwxr-xr-x   2 root  admin   42496 Aug 20 11:46 check_ping
>>
>>
>>
>> using the check_ping command returns the expected result for a host
>> or service that is down.
>> $ sudo -u nagios ./check_ping -H server03 -w 100.0,20% -c 500.0,60%
>> CRITICAL - Plugin timed out after 10 seconds
>>
>>
>> manually sending an email using the code from the notification
>> command works. i receive the email.
>> $ sudo -u nagios  /usr/bin/printf "%b" "***** Nagios 2.9 *****\n
>> \nNotification Type: CRITICAL\n\nService: PING\nHost: server03
>> \nAddress: 192.168.10.127\nState: down\n\nDate/Time: Today Aug 22nd\n
>> \nAdditional Info:\n\nTimeout" | /usr/bin/mail -s "** CRITICAL alert
>> - server03/PING is down **" it at intranet.com
>>
>>
>> config directory set and notifications turned on.
>> $ cat /opt/local/etc/nagios/nagios.cfg
>> ...
>> cfg_dir=/opt/local/etc/nagios/ny
>> ...
>> log_notifications=1
>>
>>
>>
>> $ cat /opt/local/etc/nagios/ny/contacts.cfg
>> ...
>> define contact{
>>          contact_name                    nagios-admin
>>          alias                           Nagios Admin
>>          service_notification_period     24x7
>>          host_notification_period        24x7
>>          service_notification_options    w,u,c,r
>>          host_notification_options       d,r
>>          service_notification_commands   notify-by-email
>>          host_notification_commands      host-notify-by-email
>>          email                           it at intranet.com
>>          }
>>
>>
>>
>>
>> $ cat /opt/local/etc/nagios/ny/contactgroups.cfg
>> ...
>> define contactgroup{
>>          contactgroup_name       admins
>>          alias                   Nagios Administrators
>>          members                 nagios-admin
>>          }
>>
>>
>>
>> note: the ping command actually runs check_icmp in the command line.
>> $ cat /opt/local/etc/nagios/ny/commands.cfg
>> ...
>>
>> # 'check_ping' command definition
>> define command{
>>          command_name    check_ping
>>          command_line    $USER1$/check_icmp -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
>>          }
>> ...
>> # 'host-notify-by-email' command definition
>> define command{
>>          command_name    host-notify-by-email
>>          command_line    /usr/bin/printf "%b" "***** Nagios 2.9 *****
>> \n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState:
>> $HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time:
>> $LONGDATETIME$\n" | /usr/bin/mail -s "Host $HOSTSTATE$ alert for
>> $HOSTNAME$!" $CONTACTEMAIL$
>>          }
>>
>> # 'notify-by-email' command definition
>> define command{
>>          command_name    notify-by-email
>>          command_line    /usr/bin/printf "%b" "***** Nagios 2.9 *****
>> \n\nNotification Type: $NOTIFICATIONTYPE$\n\nService: $SERVICEDESC$
>> \nHost: $HOSTALIAS$\nAddress: $HOSTADDRESS$\nState: $SERVICESTATE$\n
>> \nDate/Time: $LONGDATETIME$\n\nAdditional Info:\n\n$SERVICEOUTPUT$"
>> | /usr/bin/mail -s "** $NOTIFICATIONTYPE$ alert - $HOSTALIAS$/
>> $SERVICEDESC$ is $SERVICESTATE$ **" $CONTACTEMAIL$
>>          }
>>
>>
>>
>>
>> $ cat /opt/local/etc/nagios/ny/hostgroups.cfg
>> ...
>> define hostgroup{
>>          hostgroup_name  windows-servers
>>          alias           Windows Servers
>>          members         server01,server02,server03
>>          }
>>
>>
>>
>>
>> $ cat /opt/local/etc/nagios/ny/hosts.cfg
>> ...
>> define host{
>>          use                     windows-servers          ; Name of
>> host template to use
>>                                                          ; This host
>> definition will inherit all variables that are defin$
>>                                                          ; in (or
>> inherited by) the windows-server host template definiti$
>>          host_name               server03
>>          alias                   Label Server
>>          address                 192.168.10.127
>>          }
>>
>>
>>
>> $ cat /opt/local/etc/nagios/ny/services.cfg
>> ...
>> define service{
>>          use                             remote-service         ;
>> Name of service template to use
>>          hostgroup                       windows-servers
>>          service_description             PING
>>          check_command                   check_ping!100.0,20%!500.0,60%
>>          }
>
>
>
> On Aug 22, 2007, at 4:16 AM, Dennis Huenseler wrote:
>> Hello,
>>
>> if i read your config correctly, i think you have to define a host
>> template "windows-servers" with the contact_groups parameter, where
>> you name the contact group you want to use for server03.
>>
>>
>> hosts.cfg:
>>
>> define host{
>> 		host_name		windows-servers
>> 		check_period	24x7
>> 		etc
>> 		etc
>> ->         	contact_groups	admins	<-		
>> }
>>
>>
>> define host{
>>          use                     windows-servers          ; Name  
>> of host template to use
>>          host_name               server03
>>          alias                   Label Server
>>          address                 192.168.10.127
>>          }
>
> hi dennis.
>
> thanks for your reply. sorry i did not include that section of my
> hosts.cfg in my original post. i do have the windows-servers host
> template defined. see below for the template definition.
>
> $ cat /opt/local/etc/nagios/ny/hosts.cfg
> ...
> define host{
>         name                            windows-servers  ; The name  
> of this host template
>         use                             generic-host    ; This  
> template inherits other values from the generic-host template
>         check_period                    24x7            ; By  
> default, Windows hosts are checked round the clock
>         max_check_attempts              10              ; Check  
> each Windows host 10 times (max)
>         check_command                   check-host-alive ; Default  
> command to check Windows hosts
>         notification_period             workhours       ; Admins  
> hate to be woken up, so we only notify during the day
>                                                         ; Note that  
> the notification_period variable is being overridden from
>                                                         ; the value  
> that is inherited from the generic-host template!
>         notification_interval           120             ; Resend  
> notification every 2 hours
>         notification_options            d,u,r           ; Only send  
> notifications for specific host states
>         contact_groups                  admins          ;  
> Notifications get sent to the admins by default
>         register                        0               ; DONT  
> REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
>         }
>
>
> i also forgot to include the following from the nagios.cfg file
> $ cat /opt/local/etc/nagios/nagios.cfg
> ...
> enable_notifications=1

i am not sure why, since i did not change anything in the nagios
configuration files, but i just received an email notification that
server03 is down. if anything changes i will update this post.
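
A possible explanation, offered as a guess rather than a confirmed diagnosis: the windows-servers template quoted above sets notification_period to workhours, so a host that goes down outside that period would not produce a notification until the period begins again. The sketch below converts the two HARD DOWN timestamps from the log above to local time and tests them against a workhours window. Two things in it are assumptions that do not appear in this thread: the UTC-4 offset for the Nagios host, and the Mon-Fri 09:00-17:00 window (the range used by the sample workhours timeperiod; the actual timeperiod definition was not posted).

```python
from datetime import datetime, time, timedelta, timezone

# ASSUMPTION: the Nagios host runs on US Eastern daylight time (UTC-4).
LOCAL_TZ = timezone(timedelta(hours=-4))

# ASSUMPTION: "workhours" is Mon-Fri 09:00-17:00, as in the sample
# timeperiod definition; the real definition was not shown in the thread.
WORK_START, WORK_END = time(9, 0), time(17, 0)


def to_local(epoch):
    """Convert a nagios.log epoch timestamp to the assumed local time."""
    return datetime.fromtimestamp(epoch, tz=LOCAL_TZ)


def in_workhours(epoch):
    """Return True if the timestamp falls inside the assumed window."""
    local = to_local(epoch)
    is_weekday = local.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and WORK_START <= local.time() < WORK_END


# The two HARD DOWN host alerts taken from the log in the original post.
for epoch in (1187737512, 1187741712):
    stamp = to_local(epoch).strftime("%a %Y-%m-%d %H:%M:%S")
    print(epoch, stamp, "in workhours:", in_workhours(epoch))
```

Under these assumptions both DOWN events land on Tuesday evening (about 19:05 and 20:15), outside the window, which would be consistent with no notification attempt being logged at the time and the email arriving only after the window opened the next morning. If the host uses a different time zone or a different workhours definition, the result may differ.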

thanks everyone for your help.

