Nagios Notifications not being Sent Out
Goutos, Kevin
kgoutos at libertymgt.com
Fri Oct 16 16:31:27 CEST 2009
Thanks for the reply, Marc.
I do have some good news. I did receive a notification last night for a
flapping alert. However, it is still not sending out alerts for a host
going down, returning to an up state, etc.
[1255678430] SERVICE FLAPPING ALERT: AUSTIN-LAPTOP;CPU Load;STOPPED; Service appears to have stopped flapping (4.0% change < 5.0% threshold)
[1255678430] SERVICE NOTIFICATION: libertyadmins;AUSTIN-LAPTOP;CPU Load;FLAPPINGSTOP (CRITICAL);notify-service-by-email;No route to host
Looking at this portion of the log, it seems notifications should have
been sent out, but none were:
[1255547104] HOST ALERT: CORP-NSS;DOWN;SOFT;1;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547104] SERVICE ALERT: CORP-NSS;CPU Load;CRITICAL;HARD;1;No route to host
[1255547104] SERVICE ALERT: CORP-NSS;NSClient++ Version;CRITICAL;HARD;1;No route to host
[1255547174] HOST ALERT: CORP-NSS;DOWN;SOFT;2;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547234] SERVICE ALERT: CORP-NSS;Ping Test;CRITICAL;HARD;1;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547244] HOST ALERT: CORP-NSS;DOWN;SOFT;3;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547314] HOST ALERT: CORP-NSS;DOWN;SOFT;4;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547384] HOST ALERT: CORP-NSS;DOWN;SOFT;5;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547454] HOST ALERT: CORP-NSS;DOWN;SOFT;6;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547474] SERVICE ALERT: CORP-NSS;Used Disk Space;CRITICAL;HARD;1;No route to host
[1255547474] SERVICE ALERT: CORP-NSS;System Uptime;CRITICAL;HARD;1;No route to host
[1255547524] HOST ALERT: CORP-NSS;DOWN;SOFT;7;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547584] SERVICE ALERT: CORP-NSS;Memory Usage;CRITICAL;HARD;1;No route to host
[1255547594] HOST ALERT: CORP-NSS;DOWN;SOFT;8;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547664] HOST ALERT: CORP-NSS;DOWN;SOFT;9;CRITICAL - Host Unreachable (IP ADDRESS)
[1255547734] HOST ALERT: CORP-NSS;DOWN;HARD;10;CRITICAL - Host Unreachable (IP ADDRESS)
[1255556204] HOST ALERT: CORP-NSS;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 10.40 ms
[1255556234] SERVICE ALERT: CORP-NSS;Ping Test;OK;HARD;1;PING OK - Packet loss = 0%, RTA = 7.97 ms
I did remove the w and c options and replaced them with 'd' where you
noted.
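The option-letter fix can also be checked mechanically. The following is only a minimal Python sketch, assuming the option letters documented for Nagios 3 object definitions (d,u,r,f,s,n for hosts; w,u,c,r,f,s,n for services). The names `VALID_HOST_OPTIONS`, `VALID_SERVICE_OPTIONS`, and `invalid_options` are hypothetical and not part of Nagios.

```python
# Option letters as documented for Nagios 3 object definitions (an assumption
# of this sketch; check the documentation for the version actually installed).
VALID_HOST_OPTIONS = set("durfsn")
VALID_SERVICE_OPTIONS = set("wucrfsn")


def invalid_options(value, valid):
    """Return the sorted option letters in a comma-separated value that are not allowed."""
    letters = [item.strip() for item in value.split(",") if item.strip()]
    return sorted(set(letters) - valid)


# Host template value before the fix, then after replacing w and c with d.
print(invalid_options("w,u,c,r,f,s", VALID_HOST_OPTIONS))
print(invalid_options("d,u,r,f,s", VALID_HOST_OPTIONS))
```

The same helper can be pointed at service definitions by passing `VALID_SERVICE_OPTIONS` instead.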
"I am assuming that the members of the libertyadminsgroup all use the
generic-contact template you provided with minimal modifications. It
wouldn't hurt to provide the complete object for this and a sample
problem service from both objects.cache and status.dat when the
problem occurs."
That is correct. The libertyadminsgroup right now is just my e-mail
address, and I'm using the generic-contact template.
Here is something I don't understand. What I pasted below is from
objects.cache. Why does that show service and host notification options
set to only f,s? Below that I'll show what I have in contacts.cfg.
Objects.cache
define contact {
contact_name libertyadmins
alias Liberty Admins
service_notification_period 24x7
host_notification_period 24x7
service_notification_options f,s
host_notification_options f,s
service_notification_commands notify-service-by-email
host_notification_commands notify-host-by-email
email kgoutos at libertymgt.com
host_notifications_enabled 1
service_notifications_enabled 1
can_submit_commands 1
retain_status_information 1
retain_nonstatus_information 1
}
Contacts.cfg
define contact{
contact_name libertyadmins ; Short name of user
use generic-contact ; Inherit default values from generic-contact template (defined above)
alias Liberty Admins ; Full name of user
host_notifications_enabled 1
service_notifications_enabled 1
host_notification_commands notify-host-by-email
service_notification_commands notify-service-by-email
host_notification_period 24x7
service_notification_period 24x7
host_notification_options d,u,r,f,s
service_notification_options d,u,r,f,s
email kgoutos at libertymgt.com ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
}
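To list every directive that differs between the two pastes, a small script can compare them. This is only a minimal Python sketch, assuming each `define contact` block is available as a string. The helper names `parse_contact` and `differing` are hypothetical, the sample strings are trimmed copies of the blocks above, and template inheritance via `use` is not resolved.

```python
def parse_contact(text):
    """Parse one 'define contact{...}' block into a dict of directive -> value."""
    directives = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop inline ';' comments
        if not line or line.startswith("define") or line == "}":
            continue
        parts = line.split(None, 1)
        directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return directives


def differing(cfg, cache):
    """Return {directive: (cfg_value, cache_value)} where both define it differently."""
    return {k: (cfg[k], cache[k]) for k in cfg if k in cache and cfg[k] != cache[k]}


# Trimmed copies of the two blocks shown above.
CONTACTS_CFG = """define contact{
contact_name libertyadmins ; Short name of user
host_notification_options d,u,r,f,s
service_notification_options d,u,r,f,s
}"""

OBJECTS_CACHE = """define contact {
contact_name libertyadmins
service_notification_options f,s
host_notification_options f,s
}"""

diff = differing(parse_contact(CONTACTS_CFG), parse_contact(OBJECTS_CACHE))
for name, (in_cfg, in_cache) in sorted(diff.items()):
    print(f"{name}: contacts.cfg={in_cfg} objects.cache={in_cache}")
```

With the pasted values, only the two notification option directives should be reported as different.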
This is from status.dat
hoststatus {
host_name=AUSTIN-LAPTOP
modified_attributes=0
check_command=check-host-alive
check_period=24x7
notification_period=24x7
check_interval=1.000000
retry_interval=1.000000
event_handler=
has_been_checked=1
should_be_scheduled=1
check_execution_time=4.027
check_latency=0.301
check_type=0
current_state=0
last_hard_state=0
last_event_id=906
current_event_id=933
current_problem_id=0
last_problem_id=339
plugin_output=PING OK - Packet loss = 0%, RTA = 20.01 ms
long_plugin_output=
performance_data=rta=20.006001ms;3000.000000;5000.000000;0.000000 pl=0%;80;100;0
last_check=1255702500
next_check=1255702570
check_options=0
current_attempt=1
max_attempts=10
state_type=1
last_state_change=1255701660
last_hard_state_change=1255701660
last_time_up=1255702510
last_time_down=1255701600
last_time_unreachable=0
last_notification=0
next_notification=0
no_more_notifications=0
current_notification_number=0
current_notification_id=89769
notifications_enabled=1
problem_has_been_acknowledged=0
acknowledgement_type=0
active_checks_enabled=1
passive_checks_enabled=1
event_handler_enabled=1
flap_detection_enabled=1
failure_prediction_enabled=1
process_performance_data=1
obsess_over_host=1
last_update=1255702560
is_flapping=0
percent_state_change=4.54
scheduled_downtime_depth=0
}
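The fields in a block like this that matter for notifications can be pulled out with a few lines of code. This is only a minimal Python sketch, assuming the block is pasted as a string of key=value lines. `parse_status_block` and `epoch_to_utc` are hypothetical helper names, and the sample is a trimmed copy of the hoststatus block above.

```python
from datetime import datetime, timezone


def parse_status_block(text):
    """Parse the key=value lines of one status.dat block into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" in line:
            key, value = line.split("=", 1)  # values may themselves contain '='
            fields[key] = value
    return fields


def epoch_to_utc(value):
    """Convert an epoch-seconds string to an ISO 8601 UTC timestamp."""
    return datetime.fromtimestamp(int(value), tz=timezone.utc).isoformat()


# Trimmed copy of the hoststatus block shown above.
SAMPLE_STATUS = """hoststatus {
host_name=AUSTIN-LAPTOP
current_state=0
state_type=1
last_state_change=1255701660
last_notification=0
notifications_enabled=1
problem_has_been_acknowledged=0
is_flapping=0
scheduled_downtime_depth=0
}"""

status = parse_status_block(SAMPLE_STATUS)
for key in ("current_state", "state_type", "notifications_enabled",
            "last_notification", "scheduled_downtime_depth"):
    print(f"{key} = {status[key]}")
print("last_state_change (UTC):", epoch_to_utc(status["last_state_change"]))
```

Here current_state=0 is UP and state_type=1 is HARD, so this snapshot was taken while the host was still up.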
This is from nagios.log. I simply unplugged "AUSTIN-LAPTOP" (my test
machine):
[1255703030] SERVICE ALERT: AUSTIN-LAPTOP;System Uptime;CRITICAL;HARD;1;No route to host
[1255703030] SERVICE ALERT: AUSTIN-LAPTOP;Memory Usage;CRITICAL;HARD;1;No route to host
[1255703030] SERVICE ALERT: AUSTIN-LAPTOP;Ping Test;CRITICAL;HARD;1;CRITICAL - Host Unreachable (192.168.95.232)
[1255703040] SERVICE ALERT: AUSTIN-LAPTOP;NSClient++ Version;CRITICAL;HARD;1;No route to host
[1255703040] SERVICE ALERT: AUSTIN-LAPTOP;Used Disk Space;CRITICAL;HARD;1;No route to host
That's all I see in the log, nothing about notifications or anything.
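To check a longer log for HARD alerts that have no matching notification entries, the lines can be scanned with a short script. This is only a minimal Python sketch, assuming the `[epoch] TYPE: field;field;...` layout seen in the excerpts in this message. `parse_log` and `summarize` are hypothetical helper names, and the sample lines are copied from the excerpt above.

```python
import re
from datetime import datetime, timezone

# "[epoch] ENTRY TYPE: semicolon-separated fields", as in the excerpts above.
LINE_RE = re.compile(r"^\[(\d+)\] ([A-Z ]+): (.*)$")


def parse_log(lines):
    """Return (utc_datetime, entry_type, fields) tuples for lines that match."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            when = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
            entries.append((when, match.group(2), match.group(3).split(";")))
    return entries


def summarize(entries):
    """Split entries into HARD-state alerts and notification entries."""
    hard_alerts = [e for e in entries if e[1].endswith("ALERT") and "HARD" in e[2]]
    notifications = [e for e in entries if e[1].endswith("NOTIFICATION")]
    return hard_alerts, notifications


SAMPLE_LOG = [
    "[1255703030] SERVICE ALERT: AUSTIN-LAPTOP;System Uptime;CRITICAL;HARD;1;No route to host",
    "[1255703030] SERVICE ALERT: AUSTIN-LAPTOP;Memory Usage;CRITICAL;HARD;1;No route to host",
    "[1255703030] SERVICE ALERT: AUSTIN-LAPTOP;Ping Test;CRITICAL;HARD;1;CRITICAL - Host Unreachable (192.168.95.232)",
    "[1255703040] SERVICE ALERT: AUSTIN-LAPTOP;NSClient++ Version;CRITICAL;HARD;1;No route to host",
    "[1255703040] SERVICE ALERT: AUSTIN-LAPTOP;Used Disk Space;CRITICAL;HARD;1;No route to host",
]

hard_alerts, notifications = summarize(parse_log(SAMPLE_LOG))
print(f"HARD alerts: {len(hard_alerts)}, notifications: {len(notifications)}")
for when, entry_type, fields in hard_alerts:
    print(when.isoformat(), entry_type, fields[0], fields[1])
```

Reading the real file instead of `SAMPLE_LOG` only requires passing an open file object to `parse_log`; the log path depends on the installation.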
- Verify that notifications are enabled program-wide in nagios.cfg.
I was able to confirm everything was enabled.
- Verify that it hasn't been disabled via the GUI (Program Status)
Also confirmed.
- Verify that notifications for the specific service haven't been
disabled via the GUI (click on them and look or look for them in
status.dat)
Confirmed.
- See the 'Query regarding Nagios notification' thread from yesterday
so we don't have to repeat further.
I reviewed this thread and double-checked everything he was having
trouble with.
Thank you very much for the help! Please let me know if you need any
other information!
-----Original Message-----
From: Marc Powell [mailto:marc at ena.com]
Sent: Thursday, October 15, 2009 5:36 PM
To: Nagios-users at lists.sourceforge.net users
Subject: Re: [Nagios-users] Nagios Notifications not being Sent Out
On Oct 15, 2009, at 3:40 PM, Goutos, Kevin wrote:
> Hello all,
Hello.
> That shows a test host I've been using, I don't see anything in
> there about sending out a notification though..don't know if I
> should be. ..
>
>
> Oct 15 16:11:50 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;CPU Load;OK;HARD;1;CPU Load 2% (5 min average)
> Oct 15 16:11:50 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;Memory Usage;OK;HARD;1;Memory usage: total:2440.61 Mb - used: 402.64 Mb (16%) - free: 2037.97 Mb (84%)
> Oct 15 16:11:50 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;NSClient++ Version;OK;HARD;1;NSClient++ 0.3.6.818 2009-06-14
> Oct 15 16:11:50 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;System Uptime;OK;HARD;1;System Uptime - 1 day(s) 18 hour(s) 57 minute(s)
> Oct 15 16:11:50 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;Used Disk Space;OK;HARD;1;c:\ - total: 37.26 Gb - used: 21.69 Gb (58%) - free 15.57 Gb (42%)
> Oct 15 16:12:00 nagios nagios: SERVICE ALERT: AUSTIN-LAPTOP;Ping Test;OK;HARD;1;PING OK - Packet loss = 0%, RTA = 3.62 ms
> Oct 15 16:12:00 nagios nagios: HOST ALERT: AUSTIN-LAPTOP;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 0.55 ms
These are all OK states. Do you have any examples of non-OK states
when you expect a notification to have been sent? Please also provide
a few entries prior to the state change and all the way through from
SOFT to HARD state.
> define contact{
> name generic-contact ; The name of this contact template
>
> host_notification_options w,u,c,r,f,s ; send notifications for all host states, flapping events, and scheduled downtime events
w and c are not valid for host_notification_options. You want 'd'
instead.
> define host{
> name generic-host
> notification_options w,u,c,r,f,s ; Only send notifications for specific host states
w and c are not valid for host notification_options. You want 'd'
instead. Check all your host templates.
> contact_groups libertyadminsgroup ; Notifications get sent to the admins by default
> register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
> }
I am assuming that the members of the libertyadminsgroup all use the
generic-contact template you provided with minimal modifications. It
wouldn't hurt to provide the complete object for this and a sample
problem service from both objects.cache and status.dat when the
problem occurs.
- Verify that notifications are enabled program-wide in nagios.cfg
- Verify that it hasn't been disabled via the GUI (Program Status)
- Verify that notifications for the specific service haven't been
disabled via the GUI (click on them and look or look for them in
status.dat)
- See the 'Query regarding Nagios notification' thread from yesterday
so we don't have to repeat further.
--
Marc
------------------------------------------------------------------------
------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and
stay
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when
reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null