host down notification but no host up notification ?
stucky
stucky101 at gmail.com
Wed Jun 13 03:35:59 CEST 2007
Guys,
I'm testing Nagios 3.0a and I think there may be a notification bug.
I have the following config:
define timeperiod{
timeperiod_name 24x7
alias 24 Hours A Day, 7 Days A Week
sunday 00:00-24:00
monday 00:00-24:00
tuesday 00:00-24:00
wednesday 00:00-24:00
thursday 00:00-24:00
friday 00:00-24:00
saturday 00:00-24:00
}
define contact{
name generic-contact ; The name of this contact template
service_notification_period 24x7 ; service notifications can be sent anytime
host_notification_period 24x7 ; host notifications can be sent anytime
service_notification_options w,u,c,r,f,s ; send notifications for all service states, flapping events, and scheduled downtime events
host_notification_options d,u,r,f,s ; send notifications for all host states, flapping events, and scheduled downtime events
service_notification_commands notify-service-by-email ; send service notifications via email
host_notification_commands notify-host-by-email ; send host notifications via email
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
}
define contact{
contact_name astuck
use generic-contact
alias SysAdmin1
email {my email}
}
define contactgroup{
contactgroup_name admins
alias SysAdmins
members astuck
}
define host{
name generic-host ; The name of this host template
notifications_enabled 1 ; Host notifications are enabled
event_handler_enabled 1 ; Host event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
failure_prediction_enabled 1 ; Failure prediction is enabled
process_perf_data 1 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
notification_period 24x7 ; Send host notifications at any time
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
define host{
name generic-linux
use generic-host
check_period 24x7
check_interval 5
retry_interval 1
max_check_attempts 10
check_command check-host-alive
notification_interval 120
notification_options d,u,r
register 0
}
define host{
name nonprod
use generic-linux
contact_groups admins
register 0
}
define host{
use nonprod
host_name lithium
alias Oracle Dev 2
address lithium
}
As far as I can see, I should get all host and service notifications 24/7.
However, when I reboot 'lithium' I get a host down notification, but when it
comes back up I don't get anything.
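One way to double-check what 'lithium' ends up with is to walk the `use` chain by hand. The following is a minimal Python sketch, assuming the usual template rule that a locally set directive overrides an inherited one. The `HOSTS` dict and the `resolve` helper are hypothetical names for illustration, not part of Nagios, and the dict is a trimmed hand copy of the host definitions above (the template-only `name` and `register` directives are left out).

```python
# Hypothetical, trimmed copy of the host definitions above, keyed by name.
# Template-only directives (name, register) are deliberately omitted.
HOSTS = {
    "generic-host": {
        "notifications_enabled": "1",
        "notification_period": "24x7",
    },
    "generic-linux": {
        "use": "generic-host",
        "check_interval": "5",
        "retry_interval": "1",
        "max_check_attempts": "10",
        "notification_interval": "120",
        "notification_options": "d,u,r",
    },
    "nonprod": {"use": "generic-linux", "contact_groups": "admins"},
    "lithium": {"use": "nonprod", "host_name": "lithium", "address": "lithium"},
}


def resolve(name, objects):
    """Return the effective directives for `name` by following its `use` chain."""
    obj = objects[name]
    effective = {}
    parent = obj.get("use")
    if parent is not None:
        # Start from everything the parent template resolves to.
        effective.update(resolve(parent, objects))
    # Directives set locally override inherited ones.
    effective.update({k: v for k, v in obj.items() if k != "use"})
    return effective


if __name__ == "__main__":
    for key, value in sorted(resolve("lithium", HOSTS).items()):
        print(f"{key:24} {value}")
```

Under that assumption, 'lithium' resolves to `notification_options d,u,r` and `notification_period 24x7`, so the recovery (`r`) option is present on the host side as well as in the contact template.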
I turned on notification debugging:
[1181695731.149796:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Wed Dec 31 16:00:00 1969
[1181695731.149852:032.0] Notification viability test passed.
[1181695731.149861:032.1] Current notification number: 1
[1181695731.149867:032.2] Creating list of contacts to be notified.
[1181695731.149873:032.1] Host notification will NOT be escalated.
[1181695731.149879:032.2] Adding contact 'astuck' to notification list.
[1181695731.149985:032.2] ** Attempting to notifying contact 'astuck'...
[1181695731.149994:032.2] ** Checking host notification viability for
contact 'astuck'...
[1181695731.150005:032.2] ** Host notification viability for contact
'astuck' PASSED.
[1181695731.150014:032.2] ** Notifying contact 'astuck'
[1181695731.150071:032.2] Raw Command: /usr/bin/printf "%b" "***** Nagios
*****\n\nNotification Type: $NOTIFICATIONTYPE$\nHost: $HOSTNAME$\nState:
$HOSTSTATE$\nAddress: $HOSTADDRESS$\nInfo: $HOSTOUTPUT$\n\nDate/Time:
$LONGDATETIME$\n" | /bin/mail -s "** $NOTIFICATIONTYPE$ Host Alert:
$HOSTNAME$ is $HOSTSTATE$ **" $CONTACTEMAIL$
[1181695731.150078:032.2] Processed Command: /usr/bin/printf "%b" "*****
Nagios *****\n\nNotification Type: PROBLEM\nHost: lithium\nState:
DOWN\nAddress: lithium\nInfo: (No output returned from host
check)\n\nDate/Time: Tue Jun 12 17:48:51 PDT 2007\n" | /bin/mail -s "**
PROBLEM Host Alert: lithium is DOWN **" {my email}
[1181695731.194505:032.0] No contacts were notified. Next possible
notification time: Tue Jun 12 19:48:51 2007
[1181695731.194527:032.0] 1 contacts were notified.
[1181695741.047809:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Tue Jun 12 17:48:51 2007
[1181695741.047834:032.1] Its not yet time to re-notify the contacts about
this host problem...
[1181695741.047843:032.1] Next acceptable notification time: Tue Jun 12
19:48:51 2007
[1181695741.047850:032.0] Notification viability test failed. No
notification will be sent out.
[1181695751.160027:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Tue Jun 12 17:48:51 2007
[1181695751.160058:032.1] Its not yet time to re-notify the contacts about
this host problem...
[1181695751.160068:032.1] Next acceptable notification time: Tue Jun 12
19:48:51 2007
[1181695751.160074:032.0] Notification viability test failed. No
notification will be sent out.
[1181695811.210449:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Tue Jun 12 17:48:51 2007
[1181695811.210479:032.1] Its not yet time to re-notify the contacts about
this host problem...
[1181695811.210489:032.1] Next acceptable notification time: Tue Jun 12
19:48:51 2007
[1181695811.210495:032.0] Notification viability test failed. No
notification will be sent out.
[1181695821.068538:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Tue Jun 12 17:48:51 2007
[1181695821.068569:032.1] Its not yet time to re-notify the contacts about
this host problem...
[1181695821.068580:032.1] Next acceptable notification time: Tue Jun 12
19:48:51 2007
[1181695821.068586:032.0] Notification viability test failed. No
notification will be sent out.
[1181695821.068895:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Tue Jun 12 17:48:51 2007
[1181695821.068915:032.1] Its not yet time to re-notify the contacts about
this host problem...
[1181695821.068924:032.1] Next acceptable notification time: Tue Jun 12
19:48:51 2007
[1181695821.068931:032.0] Notification viability test failed. No
notification will be sent out.
[1181695831.174383:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Tue Jun 12 17:48:51 2007
[1181695831.174418:032.1] Its not yet time to re-notify the contacts about
this host problem...
[1181695831.174427:032.1] Next acceptable notification time: Tue Jun 12
19:48:51 2007
[1181695831.174434:032.0] Notification viability test failed. No
notification will be sent out.
[1181695831.174731:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Tue Jun 12 17:48:51 2007
[1181695831.174745:032.1] Its not yet time to re-notify the contacts about
this host problem...
[1181695831.174754:032.1] Next acceptable notification time: Tue Jun 12
19:48:51 2007
[1181695831.174760:032.0] Notification viability test failed. No
notification will be sent out.
[1181695851.144314:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Tue Jun 12 17:48:51 2007
[1181695851.144338:032.1] Its not yet time to re-notify the contacts about
this host problem...
[1181695851.144347:032.1] Next acceptable notification time: Tue Jun 12
19:48:51 2007
[1181695851.144354:032.0] Notification viability test failed. No
notification will be sent out.
[1181696025.034559:032.0] ** Service Notification Attempt ** Host:
'lithium', Service: 'DISK USAGE /tmp', Type: 0, Current State: 0, Last
Notification: Wed Dec 31 16:00:00 1969
[1181696025.034582:032.1] We shouldn't notify about this recovery.
[1181696025.034589:032.0] Notification viability test failed. No
notification will be sent out.
[1181696031.130428:032.0] ** Service Notification Attempt ** Host:
'lithium', Service: 'LOAD', Type: 0, Current State: 0, Last Notification:
Wed Dec 31 16:00:00 1969
[1181696031.130452:032.1] We shouldn't notify about this recovery.
[1181696031.130460:032.0] Notification viability test failed. No
notification will be sent out.
[1181696031.131081:032.0] ** Service Notification Attempt ** Host:
'lithium', Service: 'DISK USAGE /usr/local', Type: 0, Current State: 0, Last
Notification: Wed Dec 31 16:00:00 1969
[1181696031.131095:032.1] We shouldn't notify about this recovery.
[1181696031.131102:032.0] Notification viability test failed. No
notification will be sent out.
[1181696111.052735:032.0] ** Service Notification Attempt ** Host:
'lithium', Service: 'CFENVD', Type: 0, Current State: 0, Last Notification:
Wed Dec 31 16:00:00 1969
[1181696111.052759:032.1] We shouldn't notify about this recovery.
[1181696111.052766:032.0] Notification viability test failed. No
notification will be sent out.
[1181696111.052971:032.0] ** Service Notification Attempt ** Host:
'lithium', Service: 'PERC CONTROLLER', Type: 0, Current State: 0, Last
Notification: Wed Dec 31 16:00:00 1969
[1181696111.052984:032.1] We shouldn't notify about this recovery.
[1181696111.052992:032.0] Notification viability test failed. No
notification will be sent out.
[1181696111.053334:032.0] ** Service Notification Attempt ** Host:
'lithium', Service: 'CFEXECD', Type: 0, Current State: 0, Last Notification:
Wed Dec 31 16:00:00 1969
[1181696111.053348:032.1] We shouldn't notify about this recovery.
[1181696111.053355:032.0] Notification viability test failed. No
notification will be sent out.
[1181696121.163710:032.0] ** Service Notification Attempt ** Host:
'lithium', Service: 'MEM', Type: 0, Current State: 0, Last Notification: Wed
Dec 31 16:00:00 1969
[1181696121.163738:032.1] We shouldn't notify about this recovery.
[1181696121.163746:032.0] Notification viability test failed. No
notification will be sent out.
[1181696121.163984:032.0] ** Service Notification Attempt ** Host:
'lithium', Service: 'DISK USAGE /var', Type: 0, Current State: 0, Last
Notification: Wed Dec 31 16:00:00 1969
[1181696121.163998:032.1] We shouldn't notify about this recovery.
[1181696121.164005:032.0] Notification viability test failed. No
notification will be sent out.
[1181696141.130999:032.0] ** Service Notification Attempt ** Host:
'lithium', Service: 'DISK USAGE /', Type: 0, Current State: 0, Last
Notification: Wed Dec 31 16:00:00 1969
[1181696141.131023:032.1] We shouldn't notify about this recovery.
[1181696141.131031:032.0] Notification viability test failed. No
notification will be sent out.
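A log like this is easier to read once it is reduced to the notification attempts alone. The following is a minimal Python sketch, assuming only the entry layout visible above (a bracketed prefix that starts with epoch seconds, followed by the message). The names `ENTRY`, `ATTEMPT` and `notification_attempts` are hypothetical, and `SAMPLE` is a short excerpt copied from the log rather than the full file.

```python
import re

# One debug entry: a bracketed prefix starting with epoch seconds, then the
# message. Entries may be wrapped over several lines or run together.
ENTRY = re.compile(
    r"\[(\d+)\.\d+:\d+\.\d+\]\s*(.*?)(?=\[\d+\.\d+:\d+\.\d+\]|\Z)", re.S
)
ATTEMPT = re.compile(
    r"\*\* (Host|Service) Notification Attempt \*\* Host: '([^']+)',"
    r"(?: Service: '([^']+)',)? Type: \d+, Current State: (\d+)"
)

SAMPLE = """\
[1181695731.149796:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Wed Dec 31 16:00:00 1969
[1181695731.149852:032.0] Notification viability test passed.
[1181695851.144314:032.0] ** Host Notification Attempt ** Host: 'lithium',
Type: 0, Current State: 1, Last Notification: Tue Jun 12 17:48:51 2007
[1181696025.034559:032.0] ** Service Notification Attempt ** Host:
'lithium', Service: 'DISK USAGE /tmp', Type: 0, Current State: 0, Last
Notification: Wed Dec 31 16:00:00 1969
[1181696025.034582:032.1] We shouldn't notify about this recovery.
"""


def notification_attempts(log_text):
    """Return (epoch_seconds, kind, host, service, state) for each attempt."""
    attempts = []
    for stamp, message in ENTRY.findall(log_text):
        flat = " ".join(message.split())  # undo the line wrapping
        match = ATTEMPT.search(flat)
        if match:
            kind, host, service, state = match.groups()
            attempts.append((int(stamp), kind, host, service, int(state)))
    return attempts


if __name__ == "__main__":
    for attempt in notification_attempts(SAMPLE):
        print(attempt)
```

In the log above, every host attempt reports Current State: 1 and the last one is stamped 1181695851, while the first service attempt is stamped 1181696025. A summary like this makes it easy to see that no host attempt appears after the services start recovering.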
Clearly, Nagios decided that I shouldn't get a host up notification; I just
don't understand why. From the log files, I'd say the following logic takes
place:
1. Host goes down - service check fails
2. Nagios checks to see if host is down - YES
3. Because of step 2. no service notifications are sent
4. Host down notification is sent instead
5. Host comes back
6. Service checks start recovering - no service recovery notification is
sent since no service problem notifications were sent in the first place.
7. Host is assumed to be up since service is up
8. Hence - no host up notification.
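The steps above can be sketched as a toy replay. This is only a model of the sequence guessed in the list, not code taken from Nagios, and whether Nagios really behaves this way is exactly the open question. The `replay` function and its event tuples are hypothetical, and the model assumes every service failure is caused by the host being down.

```python
def replay(events):
    """Replay (check_type, ok) events under the hypothesized sequence.

    Models the guess in the numbered list only; not the Nagios source.
    """
    host_state = "UP"
    sent = []
    for check_type, ok in events:
        if check_type == "service" and not ok:
            # Steps 1-4: the failed service check leads to a host check, the
            # host is found down, and the host notification replaces the
            # service notification.
            if host_state == "UP":
                host_state = "DOWN"
                sent.append("HOST DOWN")
        elif check_type == "service" and ok:
            # Step 6: no service problem was ever announced, so there is no
            # service recovery to announce either.
            # Steps 7-8: the host is silently assumed up, so no HOST UP.
            host_state = "UP"
        elif check_type == "host" and ok and host_state == "DOWN":
            # An actual host check that sees the host back up would notify.
            host_state = "UP"
            sent.append("HOST UP")
    return sent


if __name__ == "__main__":
    # Services recover before any host check runs.
    print(replay([("service", False), ("service", True)]))
    # A host check happens to run first.
    print(replay([("service", False), ("host", True), ("service", True)]))
```

In this model the host up notification appears only when a host check runs before the first recovering service check, so the outcome would depend on check timing.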
At first I thought my host up notification might not be making it through one
of the notification filters, but according to the log there is NO HOST check
after the reboot, and therefore no host notification attempt.
This looks to me like a design bug, but I want to make sure I'm not getting
this wrong. It just doesn't make sense to me that I wouldn't be notified
about a host coming back. I understand the part about the services.
INTERESTING: I have rebooted a few times, and it appears that sometimes I do
get host up notifications, but most of the time I don't, so it seems to
depend on when exactly the reboot occurs.
Also, I turned off flap detection globally, but it made no difference.
Has anyone seen this behaviour?
--
stucky