Understanding Passive Checks
Cliff Riggs
cliff at proteris.com
Tue Mar 30 23:52:44 CEST 2004
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hello,
With some research and help from the list I got my passive checks using
NSCA to work. I'm now trying to interpret the results as I'm seeing
quite a bit of flapping.
I have freshness checking enabled on the primary as follows:
check_service_freshness=1
freshness_check_interval=600
command_check_interval=-1
use_retained_program_state=0
retain_state_information=1
and I have it configured to use the freshness checking commands as
described here: http://nagios.sourceforge.net/docs/1_0/distributed.html
The primary service definition defines the check_command as
"service_is_stale", while the remote cisco-test service definition
defines the check_command as "check_ping!100.0,20%!500.0,60%". The
primary system also has "active_checks_enabled 0" set.
The Nagios event log looks like this:
[03-30-2004 16:25:25] SERVICE ALERT: cisco-test;PING;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 3.48 ms
[03-30-2004 16:25:25] SERVICE ALERT: cisco-test;PING;CRITICAL;HARD;3;CRITICAL: Service results are stale!
[03-30-2004 16:25:19] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;cisco-test;PING;0;PING OK - Packet loss = 0%, RTA = 3.48 ms
[03-30-2004 16:24:25] SERVICE ALERT: cisco-test;PING;CRITICAL;SOFT;2;CRITICAL: Service results are stale!
[03-30-2004 16:23:25] SERVICE ALERT: cisco-test;PING;CRITICAL;SOFT;1;CRITICAL: Service results are stale!
[03-30-2004 16:22:25] SERVICE ALERT: cisco-test;PING;OK;SOFT;2;PING OK - Packet loss = 0%, RTA = 3.45 ms
[03-30-2004 16:22:25] SERVICE ALERT: cisco-test;PING;CRITICAL;SOFT;1;CRITICAL: Service results are stale!
[03-30-2004 16:22:18] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;cisco-test;PING;0;PING OK - Packet loss = 0%, RTA = 3.45 ms
[03-30-2004 16:19:25] SERVICE ALERT: cisco-test;PING;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 3.34 ms
As a result, the service is flapping, even though the external command
is being received correctly. From what I understand of the timing, the
external check result is being received, but not read in a timely
fashion? Is there something I am missing in this equation? I am
especially confused by the timing. It looks like it is checking
freshness every 60 seconds (the default, which I changed). Either that
or the timing is coincidental, as this worked fine for the first 21
minutes or so and then started flapping 22 minutes and 15 seconds after
restart.
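
To make the timing easier to see, the timestamps in the log excerpt can be compared with a short script. This is only a rough sketch: the helper names parse and gaps are made up for illustration, the log lines are copied from the excerpt above with the wrapped lines rejoined and the plugin output trimmed, and it says nothing about why Nagios behaves this way.

```python
from datetime import datetime

# Log excerpt copied from the event log in this message, one entry per
# line (wrapped lines rejoined, plugin output trimmed for brevity).
LOG = """\
[03-30-2004 16:25:25] SERVICE ALERT: cisco-test;PING;OK;HARD;3;PING OK
[03-30-2004 16:25:25] SERVICE ALERT: cisco-test;PING;CRITICAL;HARD;3;CRITICAL: Service results are stale!
[03-30-2004 16:25:19] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;cisco-test;PING;0;PING OK
[03-30-2004 16:24:25] SERVICE ALERT: cisco-test;PING;CRITICAL;SOFT;2;CRITICAL: Service results are stale!
[03-30-2004 16:23:25] SERVICE ALERT: cisco-test;PING;CRITICAL;SOFT;1;CRITICAL: Service results are stale!
[03-30-2004 16:22:25] SERVICE ALERT: cisco-test;PING;OK;SOFT;2;PING OK
[03-30-2004 16:22:25] SERVICE ALERT: cisco-test;PING;CRITICAL;SOFT;1;CRITICAL: Service results are stale!
[03-30-2004 16:22:18] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;cisco-test;PING;0;PING OK
[03-30-2004 16:19:25] SERVICE ALERT: cisco-test;PING;OK;HARD;3;PING OK
"""


def parse(log):
    """Return (timestamp, text) pairs sorted oldest first."""
    events = []
    for line in log.splitlines():
        stamp, _, rest = line.partition("] ")
        when = datetime.strptime(stamp.lstrip("["), "%m-%d-%Y %H:%M:%S")
        events.append((when, rest))
    return sorted(events, key=lambda event: event[0])


def gaps(events, needle):
    """Seconds between consecutive entries whose text contains needle."""
    times = [when for when, text in events if needle in text]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


events = parse(LOG)
stale_gaps = gaps(events, "results are stale")
passive_gaps = gaps(events, "PROCESS_SERVICE_CHECK_RESULT")
print("seconds between stale alerts:", stale_gaps)
print("seconds between passive results:", passive_gaps)
```

For this excerpt, the stale alerts come out 60 seconds apart, while the two passive results are 181 seconds apart.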
Thanks as usual for your insights!
Cliff
- --
- --------------------------------------------
Clifford Riggs
CCIE #9314, CISSP
- --------------------------------------------
Proteris Group LLC
Information Security Consultants
Trust. Expertise. Results.
- --------------------------------------------
www.proteris.com
- --------------------------------------------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (Darwin)
iD8DBQFAaewsJ3mHWY7troQRAo+FAJ9zFrQFZn0t2kFbClC/gYAYPKdXtwCfTuoW
XMgh2Jh98kS9WCLSohCOBgs=
=fiOP
-----END PGP SIGNATURE-----