Setting up a passive check problem
Lewis Getschel
lgetschel at denver.westerngeco.slb.com
Wed Apr 13 00:08:11 CEST 2005
Sorry to describe so much and then leave out my actual problem...
Being an impatient person, I've changed my services.cfg a little. The
definitions are now:
services.cfg:
define service{
        use                     linux-service
        name                    ibm_disk_array_status
        service_description     ibm_disk_array_status
        active_checks_enabled   0
        passive_checks_enabled  1
        check_command           check_dummy
        check_freshness         0
        register                0
        }
hosts.cfg (same config as before):
# service definition
define service{
        use                     ibm_disk_array_status
        host_name               fs004,fs005,fs006,fs007,fs008
        }
commands.cfg:
# 'check_dummy' command definition
define command{
        command_name    check_dummy
        command_line    $USER1$/check_dummy 0
        }
Now, if I understand correctly:
"active_checks_enabled 0" means do NOT actually check anything (don't
run the command_line defined).
"passive_checks_enabled 1" means that nagios will only get the updates
that another script of mine writes into the command_file
("/usr/local/nagios/var/rw/nagios.cmd"). This much IS working, because I
see the following line in my event log:
[04-12-2005 14:57:15] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;fs008;ibm_disk_array_status;0;OK - No errors reported
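
For reference, here is a minimal sketch of how a script could build and submit that kind of line. It assumes the usual external command layout of "[epoch seconds] PROCESS_SERVICE_CHECK_RESULT;host;service;return code;output" and the command_file path quoted above. The function names are made up for illustration and this is not the actual script used here.

```python
import time

# Standard plugin return codes: OK, WARNING, CRITICAL, UNKNOWN.
VALID_STATES = (0, 1, 2, 3)


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT line (hypothetical helper)."""
    if return_code not in VALID_STATES:
        raise ValueError("return_code must be one of %r" % (VALID_STATES,))
    # Fields are separated by ';' and the command ends at the newline,
    # so neither character may appear in the host or service name.
    for field in (host, service):
        if ";" in field or "\n" in field:
            raise ValueError("invalid character in %r" % field)
    if timestamp is None:
        timestamp = int(time.time())
    # Only the first line of the output is sent.
    first_line = output.splitlines()[0] if output else ""
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, first_line)


def submit_passive_result(line, command_file):
    """Write one command line to the Nagios command file (a named pipe)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Example call (assumes the command_file path from this post):
# submit_passive_result(
#     format_passive_result("fs008", "ibm_disk_array_status", 0,
#                           "OK - No errors reported"),
#     "/usr/local/nagios/var/rw/nagios.cmd")
```

The host name and service_description in the line have to match a service that nagios already knows about, otherwise the result is ignored.
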
When I look at the scheduling queue it shows that my service
"ibm_disk_array_status" is scheduled to be run!
fs004 ibm_disk_array_status 04-12-2005 14:34:16 04-12-2005 14:54:16 ENABLED
When I view my fileserver services, it shows:
fs004 ibm_disk_array_status OK 04-12-2005 14:34:16 0d 1h 33m 37s 1/4 Status is OK
The problem is that the "Status is OK" message is coming from the
check_dummy command, and it _SHOULD_ be "OK - No errors reported" as my
external command shows.
------------I've done the following commands:---------------
$ sudo /etc/rc.d/init.d/nagios stop
Stopping network monitor: nagios
$ ps -ef | grep nagios | grep -v grep
$ sudo /etc/rc.d/init.d/nagios start
Starting network monitor: nagios
PID TTY TIME CMD
30767 ? 00:00:00 nagios
$ ps -ef | grep nagios | grep -v grep
nagios 30767 1 8 15:05 ? 00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
$
-----------------------------------------------------------------------
So I don't have an extra copy of nagios running.
Here is what I want to happen:
1) Tell nagios to accept passive results for these 5 servers and display
the last known status value it had for the service.
2) Don't perform any active checks, whatever I need to specify as the
check_command.
3) When my script submits a status of OK or CRITICAL (the only 2 cases),
accept that as the new status value, and notify as appropriate
until/unless the status changes or the service is acknowledged.
4) Repeat.
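
To make the intended flow concrete, here is a rough sketch of step 3 for the five servers. It assumes some earlier step has already collected error messages per host from the syslog file. The errors_by_host mapping, the function name and the CRITICAL message wording are placeholders, not part of the real script.

```python
HOSTS = ["fs004", "fs005", "fs006", "fs007", "fs008"]
SERVICE = "ibm_disk_array_status"

# Plugin return codes for the only two cases the script reports.
OK = 0
CRITICAL = 2


def results_for(hosts, errors_by_host, now):
    """Return one passive-result line per host.

    errors_by_host is a hypothetical dict mapping a host name to the list
    of error messages found for it; hosts without an entry count as OK.
    """
    lines = []
    for host in hosts:
        errors = errors_by_host.get(host, [])
        if errors:
            code = CRITICAL
            text = "CRITICAL - %d error(s), last: %s" % (len(errors), errors[-1])
        else:
            code = OK
            text = "OK - No errors reported"
        lines.append("[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s"
                     % (now, host, SERVICE, code, text))
    return lines
```

Each returned line, followed by a newline, would then be written to the command_file. Nagios should keep showing the last submitted state until the next result arrives.
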
After all this time, I thought I understood the basic operation of
Nagios, but it doesn't seem that I do.
(If someone has example configs for a passive service, could you please
post your file entries so I can see how someone else does it)
Thanks,
Marc Powell wrote:
>
>
>>-----Original Message-----
>>From: nagios-users-admin at lists.sourceforge.net [mailto:nagios-users-
>>admin at lists.sourceforge.net] On Behalf Of Lewis Getschel
>>Sent: Tuesday, April 12, 2005 11:42 AM
>>To: Nagios Users
>>Subject: [Nagios-users] Setting up a passive check problem
>>
>>All-
>> After 8 months of tweaking our 1.2 system with active checks (that
>>work fine), I now find myself at a loss to set up a passive "service
>>check".
>>
>>I have 5 file servers in a "farm" that log themselves to a single syslog
>>file.
>>I wrote a script that deals with that and can submit the passive result
>>to Nagios to be processed.
>>
>>My problem _seems_ to be my understanding of the basic setup for a
>>passive service check.
>>The docs say: "...service checks to Nagios, a service must have already
>>been defined in the object configuration file
>><http://nagios.sourceforge.net/docs/1_0/configobject.html>"
>>
>
>This means that when you submit an entry to the command file, there must
>be a matching host_name and service_description that nagios already
>knows about or it will be ignored.
>
>
>
>
>>What "check_command" does a passive service "need"? (it needs a
>>command???) I don't want nagios to _DO_ anything, just accept the
>>passive results from another process.
>>
>>When I tried to leave a check_command out, nagios complains "... check
>>command is NULL"
>>
>>
>
>As you can see, there must be one defined. Which command to use depends
>on whether you're going to be using active checks or freshness checking.
>If you are, the command must be valid, as nagios will actively execute
>it to determine the state of the service at the expiration of the
>freshness interval.
>
>If you are not using freshness checking, then the command can be
>anything you like. I use the same command that is executed on my
>distributed servers for consistency, but it could be check_dummy or any
>other command, as it will never actually be run.
>
>
>
>
>
>>services.cfg:
>>define service{
>> use linux-service
>> name ibm_disk_array_status
>> service_description ibm_disk_array_status
>> active_checks_enabled 0
>> passive_checks_enabled 1
>> check_command check_passive_disklog
>> register 0
>> }
>>
>>commands.cfg:
>># 'ibm_disk_array_status' command definition
>>define command{
>>command_name check_passive_disklog
>>command_line $USER1$/check_passive_disklog
>> }
>>
>>hosts.cfg:
>>define service{
>> use ibm_disk_array_status
>> host_name fs004,fs005,fs006,fs007,fs008
>>}
>>
>>
>>
>
>I haven't used this type of construct personally but it looks fine.
>
>
>
>>Can someone point out where I'm going wrong to simply allow a service
>>status to be accepted passively, please.
>>
>>
>
>Instead of making an assumption about what your problem is, why don't
>you tell us the symptoms and error messages that you are seeing?
>
>--
>Marc
>
>
>