Setting up a passive check problem
Marc Powell
marc at ena.com
Wed Apr 13 01:09:17 CEST 2005
> -----Original Message-----
> From: Lewis Getschel [mailto:lgetschel at denver.westerngeco.slb.com]
> Sent: Tuesday, April 12, 2005 5:08 PM
> To: Marc Powell
> Cc: Nagios Users
> Subject: Re: [Nagios-users] Setting up a passive check problem
>
> Sorry to describe so much and then leave out my actual problem...
>
> Being an impatient person I've changed my services.cfg a little... now
> they are:
>
> services.cfg:
> define service{
> use linux-service
> name ibm_disk_array_status
> service_description ibm_disk_array_status
> active_checks_enabled 0
> passive_checks_enabled 1
> check_command check_dummy
> check_freshness 0
> register 0
> }
>
> same config- hosts.cfg:
> # service definition
> define service{
> use ibm_disk_array_status
> host_name fs004,fs005,fs006,fs007,fs008
> }
>
> commands.cfg:
> # 'check_dummy' command definition
> define command{
> command_name check_dummy
> command_line $USER1$/check_dummy 0
> }
Yup. Still looks ok.
>
> Now, If I understand ...
> the idea of "active_checks_enabled 0", means do NOT
> actually check anything (don't run the command_line defined).
> the idea of "passive_checks_enabled 1" means that nagios
> will only get updates that I put into the command_file
> ("/usr/local/nagios/var/rw/nagios.cmd") through another script that is
Correct. I believe freshness checking ignores the value of
active_checks_enabled. That only comes into play if you've enabled
freshness checking, of course.
> called. This much IS working because I see the following line in my
> event log:
> [04-12-2005 14:57:15] EXTERNAL COMMAND:
> PROCESS_SERVICE_CHECK_RESULT;fs008;ibm_disk_array_status;0;OK - No
> errors reported
>
>
This indicates that Nagios saw an external command, not necessarily that
it accepted it. I'm going to guess it did, as the next log line would have
been an error of some type if Nagios had rejected it.
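
For anyone following along, a log entry like that is normally the result of
a script appending one formatted line to the command file. A minimal sketch
of such a script, assuming the command file path quoted earlier in this
thread and a date that supports +%s; the function name
submit_passive_result is made up for illustration, and the host, service
and message in the example are the ones from the log line:

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): append one passive service
# check result to the external command file.
# Usage: submit_passive_result HOST SERVICE RETURN_CODE "PLUGIN OUTPUT"
# Return codes follow the plugin convention: 0=OK, 1=WARNING, 2=CRITICAL.
# COMMAND_FILE defaults to the path quoted in this thread (an assumption);
# adjust it to match command_file in your nagios.cfg.
COMMAND_FILE="${COMMAND_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

submit_passive_result() {
    host=$1
    service=$2
    code=$3
    output=$4
    now=$(date +%s)   # external commands are stamped with epoch seconds
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$now" "$host" "$service" "$code" "$output" >> "$COMMAND_FILE"
}

# Example (commented out so sourcing this file has no side effects):
# submit_passive_result fs008 ibm_disk_array_status 0 "OK - No errors reported"
```

The host name and service description have to match the definitions in the
config exactly. The sketch does not check that the command file exists, so
if Nagios is not running the redirect may simply create an ordinary file at
that path.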
> When I look at the scheduling queue it shows that my service
> "ibm_disk_array_status" is scheduled to be run!
> fs004 ibm_disk_array_status 04-12-2005 14:34:16 04-12-2005
> 14:54:16 ENABLED
>
> When I view my fileserver services, it shows:
> fs004 ibm_disk_array_status OK 04-12-2005 14:34:16 0d 1h 33m
> 37s 1/4 Status is OK
>
> The problem is that the "Status is OK" message is coming from the
> check_dummy command, and it _SHOULD_ be "OK - No errors reported" as my
> external command shows.
This could be explained if you have state retention enabled in
nagios.cfg. See the notes on Retention at
http://nagios.sourceforge.net/docs/1_0/xodtemplate.html.
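
A quick way to see what is currently set is to grep the main config for the
retention directives. This is only a sketch: the function name
show_retention_settings is made up, and the default path is an assumption
based on the install location used elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the state-retention directives that are set
# (i.e. not commented out) in a main Nagios config file.
# The default path is an assumption; pass your own as the first argument.
show_retention_settings() {
    cfg=${1:-/usr/local/nagios/etc/nagios.cfg}
    grep -E '^[[:space:]]*(retain_state_information|state_retention_file|use_retained_program_state)[[:space:]]*=' "$cfg"
}

# Example (commented out so sourcing this file has no side effects):
# show_retention_settings /usr/local/nagios/etc/nagios.cfg
```

If retain_state_information comes back as 1, the status shown after a
restart may be the one saved in the retention file rather than a fresh
result.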
>
> ------------I've done the following commands:---------------
> $ sudo /etc/rc.d/init.d/nagios stop
> Stopping network monitor: nagios
> $ ps -ef | grep nagios | grep -v grep
> $ sudo /etc/rc.d/init.d/nagios start
> Starting network monitor: nagios
> PID TTY TIME CMD
> 30767 ? 00:00:00 nagios
> $ ps -ef | grep nagios | grep -v grep
> nagios 30767 1 8 15:05 ? 00:00:00
> /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
> $
>
> -----------------------------------------------------------------------
> So I don't have an extra copy of nagios running.
Good thinking. It's a common problem.
> Here is what I want to happen:
> 1) tell nagios to accept passive results for these 5 servers, display
> the last known status value it had for the service
Looks like you've got that configured properly.
> 2) don't perform any active checks for whatever I need to specify as a
> command
Again, it looks like you have that configured properly.
> 3) When my script places a status of OK, or CRITICAL (the only 2 cases),
> accept that as the new status value, and notify as appropriate
> until/unless the status is changed or the service is acknowledged.
This will happen as a natural occurrence of submitting passive checks.
> 4) repeat
>
> After all this time, I thought I understood the basic operation of
> Nagios, but it doesn't seem that I do.
You're close. I'll bet it's state retention that's throwing you, based
on the information so far.
> (If someone has example configs for a passive service, could you please
> post your file entries so I can see how someone else does it)
Here's how I do it. Note that I have active checks enabled but the
check_period set to none. That prevents the annoying X from being displayed
in the GUI, but the command still never gets run as an active check.
# Generic service definition template
define service{
name generic-service
active_checks_enabled 1 ; Active service checks are enabled
passive_checks_enabled 1 ; Passive service checks are enabled/accepted
parallelize_check 1 ; Active service checks should be parallelized
obsess_over_service 0 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness'
notifications_enabled 1 ; Service notifications are enabled
event_handler_enabled 1 ; Service event handler is enabled
flap_detection_enabled 1 ; Flap detection is enabled
process_perf_data 0 ; Process performance data
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
is_volatile 0
check_period none
max_check_attempts 4
normal_check_interval 5
retry_check_interval 3
notification_interval 10080
notification_period none
notification_options c,r
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
# Host definition
define host {
use generic-host
host_name host-name
alias The Renaissance Center
address <ip address removed>
}
#Service definition
define service {
use generic-service
host_name host-name
service_description PING
contact_groups tnops
check_command check_ping
}
# 'check_ping' command definition
define command{
command_name check_ping
command_line $USER1$/check_ping $HOSTADDRESS$ 30 60 500.0 1000.0 -p 10 -t 30
}
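
After editing definitions like these, it is worth running the Nagios
pre-flight check before restarting, since it catches syntax problems such
as a comment that has wrapped onto a line of its own. A minimal sketch,
assuming the binary and config paths that appear elsewhere in this thread;
the wrapper name verify_nagios_config is made up:

```shell
#!/bin/sh
# Hypothetical wrapper around the Nagios pre-flight check (nagios -v).
# Both default paths are assumptions taken from this thread; pass your
# own as the first and second arguments.
verify_nagios_config() {
    bin=${1:-/usr/local/nagios/bin/nagios}
    cfg=${2:-/usr/local/nagios/etc/nagios.cfg}
    if [ ! -x "$bin" ]; then
        echo "nagios binary not found or not executable: $bin" >&2
        return 2
    fi
    # -v makes Nagios read and verify the config, then exit
    "$bin" -v "$cfg"
}

# Example (commented out so sourcing this file has no side effects):
# verify_nagios_config && sudo /etc/rc.d/init.d/nagios restart
```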
--
Marc