Interesting problem while trying to monitor Oracle RAC services
Giorgio Zarrelli
giorgio at zarrelli.org
Mon Mar 30 10:13:24 CEST 2009
Hi,
check the environment of the user that launches the script in each case.
Which user do you run it as locally, and which user runs it when it is
called remotely through NRPE?
Giorgio
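
P.S. A minimal sketch of what I mean by "check the environment". The helper name dump_env and the variable ORA_CRS_HOME are assumptions for illustration only, not something taken from your setup. It prints the details that most often differ between an interactive shell and a command started by the NRPE daemon:

```shell
#!/bin/sh
# Hypothetical helper: print the facts that commonly differ between an
# interactive login and a command launched by the NRPE daemon.
dump_env() {
    # Effective user name running the check
    echo "user=$(id -un)"
    # Working directory; relative paths are resolved against this
    echo "cwd=$(pwd)"
    # Search path used to locate ksh, awk, perl and friends
    echo "PATH=$PATH"
    # ORA_CRS_HOME is an assumed variable name; substitute whatever
    # Oracle-related variables your login profile actually sets
    echo "ORA_CRS_HOME=${ORA_CRS_HOME:-<unset>}"
}

dump_env
```

Run it once from your shell on the AIX box and once through check_nrpe (defined as an NRPE command in the same way as check_oracle_services), then compare the two outputs. A different working directory could matter here, because check_oracle_services.pl calls check_oracle_services.sh by a relative path. If ksh cannot find that file, the captured output is empty, it contains no OFFLINE, and the plugin reports OK. That would also fit your remote output, where the crs_stat line is missing.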
Kumar, Ashish (xml.devel at gmail.com) wrote:
>
> Hello,
>
> We are facing an interesting but strange issue while trying to monitor
> Oracle RAC services.
>
> Oracle RAC is running on AIX 5.3 and Nagios is running on Fedora Core 9.
>
> The scripts we are using to monitor Oracle RAC services on AIX are as follows:
>
> -------------------------
> $ cat check_oracle_services.sh
>
> #!/usr/bin/ksh
> # found on the Internet
> RSC_KEY=$1
>
> /oracle/crs_home/bin/crs_stat -u | awk \
> 'BEGIN { FS="="; state = 0; } \
> $1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1}; \
> state == 0 {next;} \
> $1~/TARGET/ && state == 1 {apptarget = $2; state=2;} \
> $1~/STATE/ && state == 2 {appstate = $2; state=3;} \
> state == 3 {printf "%-45s %-18s\n", appname, appstate; state=0;}'
> -------------------------
>
> $ cat check_oracle_services.pl
>
> #!/usr/bin/env perl
>
> use strict;
> use Getopt::Std;
>
> my %return_value = (
> OK => 0,
> CRIT => 2,
> UNKNOWN => 3
> );
>
> my $message = "nagios";
> my $exit_status;
>
> my %opt=();
> getopts("p:h", \%opt);
>
> sub usage(){
> print "Usage: $0 -p service_name\n";
> exit $return_value{'UNKNOWN'};
> }
>
> usage() if defined $opt{'h'};
>
> my $SERVICE = $opt{'p'} if defined $opt{'p'} || usage();
>
> # the following code was added to make sure that nrpe was not getting confused
> # with dotted argument
> if ($SERVICE =~ "foo") {
> $SERVICE = "ora.foo.bar.inst";
> }
>
> my $PIPED = qx/ ksh check_oracle_services.sh $SERVICE/;
> print $PIPED;
>
> if ($PIPED =~ /OFFLINE/g) {
> $exit_status = $return_value{'CRIT'};
> $message = "Critical: $SERVICE is not running.";
> } else {
> $exit_status = $return_value{'OK'};
> $message = "OK: $SERVICE is running.";
> }
>
> print "$message\n";
> exit $exit_status;
> -------------------------
>
> When we try to run this script on AIX (local system) the output is as follows:
>
> [srv01@/home/nagios/nrpe/libexec]$ perl check_oracle_services.pl -p foo
> ora.foo.bar.inst OFFLINE
> Critical: ora.foo.bar.inst is not running.
>
> [srv01@/home/nagios/nrpe/libexec]$ perl check_oracle_services.pl -p ora.foo.bar.inst
> ora.foo.bar.inst OFFLINE
> Critical: ora.foo.bar.inst is not running.
>
> The service is indeed offline.
>
> [srv01@/home/nagios/nrpe/libexec]$ perl check_oracle_services.pl -p ora.foodb.bardb1.inst
> ora.foodb.bardb1.inst ONLINE on srv01
> OK: ora.foodb.bardb1.inst is running.
>
>
> Now, when we run the same check from the Nagios server, it reports the
> services as online even when they are not:
>
> [root@nagios libexec]# ./check_nrpe -n -H 10.0.10.20 -c check_oracle_services -a ora.foo.bar.inst
> OK: ora.foo.bar.inst is running.
>
> [root@nagios libexec]# ./check_nrpe -n -H 10.0.10.20 -c check_oracle_services -a foo
> OK: ora.foo.bar.inst is running.
>
> It is strange that we get the correct status when the scripts are
> executed locally but the wrong status when they are executed
> remotely.
>
> Has anyone faced a similar issue? I would appreciate it if someone could
> give some insight into this.
>
> Thanks
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>