Hosts and Services
craig cook
craig.cook at ncmail.net
Wed Jun 22 17:02:36 CEST 2005
My proposal is to change how the host-to-service relationship is set up.
Instead of two record types, there would be three.
Service:      ping-linux-server
Host:         linux-server1
and
host_service: linux-server1, ping-linux-server
A host can have many services, and a service can have many hosts.
In a many-to-many relationship like this, you create a third record
containing both items.
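As a rough illustration of the idea (this is not Nagios code), the third
record can be modelled as a set of (host, service) pairs. The host and
service names come from this proposal; the two helper functions are
hypothetical and exist only for the sketch.

```python
# Hypothetical sketch: a many-to-many link stored as (host, service) pairs.
host_service = {
    ("linux-server1", "ping-linux-server"),
    ("linux-server1", "smtp-linux-servers"),
    ("linux-server2", "ping-linux-server"),
}


def services_for(host):
    """Return the sorted service names linked to one host."""
    return sorted(s for h, s in host_service if h == host)


def hosts_for(service):
    """Return the sorted host names linked to one service."""
    return sorted(h for h, s in host_service if s == service)


print(services_for("linux-server1"))
print(hosts_for("ping-linux-server"))
```

Neither the host nor the service needs to know about the other; only the
link records do.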
So you would have something like this:
define host {
host_name linux-server1
address 10.0.0.1
parent ...
notes ...
notes_url ...
icon_image ...
icon_image_alt ...
vrml_image ...
statusmap_image ...
2d_coords ...
3d_coords ...
}
define command{
service ping-server
description ping test a server
check_command_line ping $HOSTNAME$
timeout 10
}
define service {
service ping-linux-server
description ping test a linux server
check_command ping-server
normal_check_interval 5
freshness_threshold 30
active_enabled 1
passive_enabled 0
parallelize 1
freshness_enabled 1
notes ...(generic instructions about what to do if ping fails – if the
host-service record does not contain details, info from here could be
used)
notes_url ...
}
define service {
service ping-windows-server
description ping test a windows server
check_command ping-server
normal_check_interval 4
freshness_threshold 20
active_enabled 1
passive_enabled 0
parallelize 1
freshness_enabled 1
notes ...(generic instructions about what to do if ping fails – if the
host-service record does not contain details, info from here could be
used)
notes_url ...
}
define service {
service smtp-linux-servers
check_command <smtp command>
etc...
}
define host_service{
# only have things directly related to this particular
# host-service combination
host linux-server1
service ping-linux-server
alias check linux-server is responding to a ping
notes ...(what to do if ping fails on this particular host)
notes_url ...
icon_image ...
icon_image_alt ...
action_url ...
statusmap_image ...
retain_status_info
retain_non_status_info
process_perf_data
event_handler
event_handler_enabled
flap_threshold_low
flap_threshold_high
flap_threshold_enabled
notification_enabled
notification_command
notification_options
notification_retry_interval
notification_contact_group
execute_on_failure 1
}
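The notes fallback mentioned in the service definitions (use the
host-service notes when present, otherwise the generic service notes)
could work roughly as follows. This is only a sketch: the dictionary
layout, the function name and the note texts are made up for
illustration.

```python
# Hypothetical sketch of the notes fallback: use the host_service notes
# if present, otherwise the generic notes on the service record.
services = {
    "ping-linux-server": {"notes": "generic: check network path, then the host"},
}

host_services = [
    {"host": "linux-server1", "service": "ping-linux-server",
     "notes": "linux-server1: call the on-site operator first"},
    {"host": "linux-server2", "service": "ping-linux-server"},  # no notes
]


def effective_notes(link):
    """Prefer link-specific notes; fall back to the service's notes."""
    specific = link.get("notes")
    if specific:
        return specific
    return services[link["service"]].get("notes", "")


for link in host_services:
    print(link["host"], "->", effective_notes(link))
```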
Now, you can reuse the service definition for more hosts.
define host_service{
host linux-server2
service ping-linux-server
etc ...
}
define host_service{
host linux-server3
service ping-linux-server
}
You do not have to store all the service information again for every
host.
If I decide my normal check interval needs to change, I change one
entry. At present I would have to craft a search-and-replace command to
do it. The end result is the same, but it is easier to achieve when the
records are split up this way.
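To make the "change one entry" point concrete, here is a small sketch
(again not Nagios code, and the data layout is my assumption): the
host_service records refer to the service by name, so the interval lives
in exactly one place.

```python
# Hypothetical sketch: host_service records point at a shared service
# definition by name, so an interval change is made in one place.
services = {
    "ping-linux-server": {"check_command": "ping-server",
                          "normal_check_interval": 5},
}

host_services = [
    ("linux-server1", "ping-linux-server"),
    ("linux-server2", "ping-linux-server"),
    ("linux-server3", "ping-linux-server"),
]


def check_intervals():
    """Map each (host, service) link to the interval of its service."""
    return {(host, svc): services[svc]["normal_check_interval"]
            for host, svc in host_services}


before = check_intervals()
services["ping-linux-server"]["normal_check_interval"] = 10  # the one edit
after = check_intervals()
print(before)
print(after)
```

All three links pick up the new interval without any of the
host_service records being touched.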
In theory, the notification details should be moved into their own
section as well.
define notification{
notification_name email-linux-admins
notification_command mail ...
notification_options whatever
notification_retry_interval 5
notification_contact_group linux-admins
}
So the host_service definition becomes:
define host_service{
# only have things directly related to this particular
# host-service combination
host linux-server1
service ping-linux-server
alias check linux-server is responding to a ping
notes ...(what to do if ping fails on this particular host)
notes_url ...
icon_image ...
icon_image_alt ...
action_url ...
statusmap_image ...
retain_status_info
retain_non_status_info
process_perf_data
event_handler
event_handler_enabled
flap_threshold_low
flap_threshold_high
flap_threshold_enabled
notification_enabled
notification_name email-linux-admins
}
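Resolving the notification by name could look something like the sketch
below. The field names are the ones from this proposal; the function and
the dictionary layout are hypothetical.

```python
# Hypothetical sketch: a host_service record names a notification record
# instead of repeating the notification fields itself.
notifications = {
    "email-linux-admins": {
        "notification_retry_interval": 5,
        "notification_contact_group": "linux-admins",
    },
}

host_service = {
    "host": "linux-server1",
    "service": "ping-linux-server",
    "notification_enabled": 1,
    "notification_name": "email-linux-admins",
}


def resolve_notification(link):
    """Return the notification record for a link, or None if disabled."""
    if not link.get("notification_enabled"):
        return None
    return notifications[link["notification_name"]]


print(resolve_notification(host_service))
```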
If an “execute_on_failure” flag is added to the host-service record, it
would allow more than one test to be run when a service on a host fails.
Nagios would need to do a lookup to work out which tests to run, though.
So where a standard “check-host-alive” command used to be added to the
host record, you can simply add
execute_on_failure 1
to the host-service record instead. Under normal operation the ping of
the host happens as scheduled; if a service failure is detected, the
ping is executed again, following the standard Nagios rules and logic.
This would save defining “service” information in the host record.
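The lookup mentioned above might be as simple as the following sketch.
It is only my guess at the logic, not how Nagios works today: the
function name is hypothetical, and I assume the failed service itself is
not re-run by this mechanism.

```python
# Hypothetical sketch of the lookup: when one service fails on a host,
# find that host's other links flagged execute_on_failure and re-run them.
host_services = [
    {"host": "linux-server1", "service": "ping-linux-server",
     "execute_on_failure": 1},
    {"host": "linux-server1", "service": "smtp-linux-servers",
     "execute_on_failure": 0},
    {"host": "linux-server2", "service": "ping-linux-server",
     "execute_on_failure": 1},
]


def checks_to_rerun(host, failed_service):
    """Services on `host` to run after `failed_service` reports a failure."""
    return [link["service"] for link in host_services
            if link["host"] == host
            and link["execute_on_failure"]
            and link["service"] != failed_service]  # assumption: skip itself


print(checks_to_rerun("linux-server1", "smtp-linux-servers"))
```

Because the result is a list, more than one link per host can carry the
flag, which is what allows more than one test on failure.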
The end result of my proposed change would be to make it easier to
define hosts, services and their relationship records. The functionality
of Nagios would not change (unless you adopt the “execute_on_failure”
idea).
These ideas come from relational database design theory. In reality,
different designs may be implemented for performance or other reasons.
Craig