Inexplicable service escalation behaviour
Ralph.Grothe at itdz-berlin.de
Thu Oct 6 16:08:13 CEST 2005
Dear List Subscribers,
I already asked here a couple of weeks ago how to properly set up an
escalation scheme. (Sorry, I only find time to continue fumbling with
Nagios every now and then at work, and at home it would be useless for
lack of a testing ground. OK, I could emulate a host and network farm
with VMware or Xen etc., but that's too much fuss.)
I desperately need further assistance.
I am not getting anywhere with this.
I swear that I have carefully read the sections on escalations in
the Nagios docs at least three times by now.
The examples presented in the docs sound very convincing to me
(though a bit far-fetched), so I believe I understand how it should
work, in *theory*.
My objective seems trivial to me.
I just want Nagios to send a *single* notification to our trouble
ticketing system, using my "file-service-sc-ticket" (misc)command
definition, while it continues to send repeated notifications to the
various admin recipients at the common notification interval
(at least the latter is working).
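To make the intended behaviour concrete, here is a minimal sketch in Python. It models only what I want to happen, not how Nagios works internally. The function name `recipients_for` and the constant `TICKET_NOTIFICATION` are made up for illustration; the value 3 is the notification number used in the escalation definition further down.

```python
# Hypothetical model of the *intended* behaviour, not of Nagios internals.
TICKET_NOTIFICATION = 3  # assumption: the one notification that should file a ticket


def recipients_for(notification_number, is_recovery=False):
    """Return the contact groups that should receive this notification."""
    groups = ["nagiosadmin"]  # admins get every notification, recoveries included
    if notification_number == TICKET_NOTIFICATION and not is_recovery:
        groups.append("service_center")  # exactly one ticket per problem
    return groups


# Walk through the first five problem notifications of one outage.
for n in range(1, 6):
    print(n, recipients_for(n))
```

In this model only notification number 3 reaches service_center, and recovery alerts never do, which is the opposite of what I currently observe.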
The filing of the ticket works great.
Too great, in fact: it is flooding the service center.
Tickets keep being generated at the common notification interval,
even for recovery alerts (which I never intended).
Tickets are also generated for downed hosts that, per my Nagios
definitions, I would not have thought could send a ticketing request
to the service center.
I wonder what use the host_name directive in the serviceescalation
definition is if tickets are filed for other hosts regardless.
A hostescalation definition doesn't exist yet.
I deliberately restricted the escalation to a test host called "fiddle"
until I get this trivial task working, after which I would of course
extend it to all my monitored hosts and services.
So this is the only escalation definition so far:
$ cat escalations.cfg
define serviceescalation {
    host_name               fiddle
    service_description     icmp-host-alive
    first_notification      3
    last_notification       3
    notification_interval   0
    contact_groups          service_center
}
This is the above service:
define service {
    use                     generic-service
    service_description     icmp-host-alive
    hostgroup_name          non_fwalled_hosts
    check_command           check-host-alive
    contact_groups          nagiosadmin,service_center
}
This is the inherited service template:
define service {
    name                            generic-service
    is_volatile                     0
    max_check_attempts              5
    normal_check_interval           5
    retry_check_interval            3
    check_period                    24x7
    active_checks_enabled           1
    passive_checks_enabled          0
    parallelize_check               1
    obsess_over_service             0
    check_freshness                 0
    event_handler                   notify-by-email
    event_handler_enabled           0
    flap_detection_enabled          0
    process_perf_data               0
    retain_status_information       1
    retain_nonstatus_information    1
    notification_interval           30
    notification_period             24x7
    notification_options            w,u,c,r
    notifications_enabled           1
    contact_groups                  nagiosadmin
    register                        0
}
This is the host definition for fiddle:
define host {
    use             generic-host
    host_name       fiddle
    alias           MC/SG Cluster Package FIDDLE
    address         123.123.123.123
    hostgroups      non_fwalled_hosts
    contact_groups  nagiosadmin
}
This is the definition of the contact group that receives the tickets
(i.e. the service center):
define contactgroup {
    contactgroup_name   service_center
    alias               Service Center TT Filer Accounts
    members             scadmin
}
And finally, this is the contact, including its template (but with a
bogus mail address here):
define contact {
    name                            generic-contact
    register                        0
    contact_name                    grothe
    alias                           Must be overridden
    contactgroups                   sazadmin
    host_notification_period        workhrs
    service_notification_period     workhrs
    host_notification_options       d,u,r
    service_notification_options    w,u,c,r
    host_notification_commands      host-notify-by-email
    service_notification_commands   notify-by-email
    email                           nagios
}
define contact {
    use                             generic-contact
    contact_name                    scadmin
    alias                           Service Center TT Filer
    email                           scadmin at our.rotten.com
    host_notification_period        24x7
    service_notification_period     24x7
    host_notification_commands      file-host-sc-ticket
    service_notification_commands   file-service-sc-ticket
    address1                        SC Token
    address2                        Another SC Token
}
I think I can skip the command definition for "file-service-sc-ticket"
here (the sheer flood of tickets tells me that at least this part is
doing its duty as expected).
I am absolutely clueless as to why the service center is receiving
those ticket filing requests repeatedly, and even from other hosts in
the host group "non_fwalled_hosts", when I did in fact specify host
fiddle in the service escalation definition
(something I would consider a clear disambiguating directive).
If I can't get this trivial but important ticket generation
functionality working, I will have to give up on Nagios altogether and
look for another tool, which I think would be very sad, given the time
spent so far and the positive impressions from the working parts.
P.S. I don't know if this is of any importance at all, but these
are the releases I run:
$ printf "%s\n\n" "$(uname -srv)";/opt/sw/nagios/bin/nagios -V|head -5
AIX 3 4
Nagios 2.0b3
Copyright (c) 1999-2005 Ethan Galstad (www.nagios.org)
Last Modified: 04-03-2005
License: GPL
Many thanks for your kind attention,
Ralph