printing info from nagios db check script
Tim Dunphy
bluethundr at jokefire.com
Mon May 14 00:27:36 CEST 2012
Hey list,
I'm having a small but important problem with a script I am writing to monitor the tablespaces of an Oracle database. This is probably more of a bash programming issue, but since the ultimate purpose of the script is to be a Nagios check, I am hoping that you won't mind me asking here.
To give you a brief overview of what I am experiencing, I'd like to start with the output of a couple of runs of the script and a couple of snippets of code.
First:
[db07-dc2:~] root% /opt/nagios/libexec/check_qaecom1_tablespace.sh
/opt/nagios/libexec/check_qaecom1_tablespace.sh: line 60: [: BAM_USER_INDX_LG: unary operator expected
All OK
The work of the script is mainly achieved through this loop:
while i=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $1}') j=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $3}') k=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $2}') l=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $4}')
do
TBLSPACE=$i
PCT=$j
FREE=$k
TOTAL=$l
if [ $TBLSPACE ]
then
echo "NOK $TBLSPACE is at $PCT $FREE/$TOTAL "
exit 2
else
echo "All OK"
exit 0
fi
done
As you can see, this code produces a 'unary operator expected' error. As you can probably tell, all it does is execute a few sqlplus commands (with the SQL contained in a separate file) and assign their output to a few variables. It then incorrectly reports an 'OK' state.
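The error can be reproduced without a database. This is a minimal sketch, assuming the query returns more than one row so that the variable ends up holding two names. The sample value is taken from the script output shown in this post; check_unquoted and check_quoted are illustrative names, not part of the script.

```shell
#!/bin/bash
# Assumption: the first column of the query output contains two rows,
# so the variable holds two newline-separated tablespace names.
TBLSPACE="BAM_USER_INDX_LG
MLBDATASM"

# Unquoted: word splitting turns the value into two arguments, so the
# test becomes `[ BAM_USER_INDX_LG MLBDATASM ]`. With two arguments, `[`
# expects the first one to be a unary operator such as -n or -z. It
# reports an error and returns non-zero, so the else branch runs.
check_unquoted() {
    if [ $TBLSPACE ] 2>/dev/null; then echo "NOK"; else echo "All OK"; fi
}

# Quoted: the value stays a single non-empty argument, so the test is true.
check_quoted() {
    if [ "$TBLSPACE" ]; then echo "NOK"; else echo "All OK"; fi
}

check_unquoted
check_quoted
```

The error message is sent to /dev/null here only to keep the demonstration quiet; the branch taken is the same either way.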
But if I try to fix the 'unary operator' error by putting the TBLSPACE variable in quotes, the result changes from "All OK" to the output below, and the script (correctly) reports an error state:
[db07-dc2:~] root% /opt/nagios/libexec/check_qaecom1_tablespace.sh
NOK BAM_USER_INDX_LG
MLBDATASM is at 97.50%
95.85% 820
340/32764
8191
while i=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $1}') j=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $3}') k=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $2}') l=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $4}')
do
TBLSPACE=$i
PCT=$j
FREE=$k
TOTAL=$l
if [ "$TBLSPACE" ]
then
echo "NOK $TBLSPACE is at $PCT $FREE/$TOTAL "
exit 2
else
echo "All OK"
exit 0
fi
done
The SQL is designed to list any tablespaces that grow beyond a certain threshold.
My question is this:
[db07-dc2:~] root% /opt/nagios/libexec/check_qaecom1_tablespace.sh
NOK BAM_USER_INDX_LG # <-- this is correctly output from the script
MLBDATASM is at 97.50% # <-- this is also correctly output from the script
95.85% 820 # <-- but where does this additional and different percentage come from?
340/32764 # <-- and why are the two variables $FREE/$TOTAL broken up on different lines?
8191
I realize that this may not be an easy question, but I wanted to put this out there in case anyone has faced a similar situation before.
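The broken-up lines can be reproduced with canned data. This is a minimal sketch, assuming the query returned the two rows below (the values are read off the output above, in the SQL's column order of name, free MB, used % and total MB). sample_rows and report are hypothetical stand-ins for the sqlplus call and the echo in the loop.

```shell
#!/bin/bash
# Assumption: the query returned these two rows.
sample_rows() {
    printf '%s\n' \
        'BAM_USER_INDX_LG 820 97.50% 32764' \
        'MLBDATASM 340 95.85% 8191'
}

# Each command substitution captures a whole column, one value per line,
# because awk prints the chosen field for every row it reads.
TBLSPACE=$(sample_rows | awk '{print $1}')
PCT=$(sample_rows | awk '{print $3}')
FREE=$(sample_rows | awk '{print $2}')
TOTAL=$(sample_rows | awk '{print $4}')

# The newlines embedded in each variable are printed as-is, so the
# single message is spread over several lines.
report() {
    echo "NOK $TBLSPACE is at $PCT $FREE/$TOTAL "
}

report
```

Under that assumption, the second percentage and the split $FREE/$TOTAL values are simply the second row's fields appearing inside each variable.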
Here is the entirety of the shell script:
#!/bin/bash
# exit codes
CRED_ERR=1 # if the credentials are not valid
NOARGS=2 # if the required parameters were not supplied
# credentials / environment variables
ORACLE_HOME="/u01/app/oracle/product/10.2.0.4"
ORACLE_SID=qaecom1
sqlplus="/u01/app/oracle/product/10.2.0.4/bin/sqlplus"
USERNAME=mlbwatch
PASS=n3x1ch3q
SID=${ORACLE_SID}
if [ -z "${USERNAME}" ] || [ -z "${PASS}" ]; # Exit if no arguments were given.
then
echo "Error: Username or Password are empty"
exit $NOARGS
fi ;
PATH=$PATH:$ORACLE_HOME/bin
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib
export ORACLE_HOME PATH LD_LIBRARY_PATH
while i=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $1}') j=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $3}') k=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $2}') l=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $4}')
do
TBLSPACE=$i
PCT=$j
FREE=$k
TOTAL=$l
if [ "$TBLSPACE" ]
then
echo "NOK $TBLSPACE is at $PCT $FREE/$TOTAL "
exit 2
else
echo "All OK"
exit 0
fi
done
errorCode=$? # checks if the last operation (sqlplus) was completed successfully or not
if [ ${errorCode} -ne 0 ]
then
echo "Running sqlplus FAILED"
exit ${CRED_ERR}
echo
fi
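One possible restructuring is to run the query once and read its output a row at a time. This is a sketch under stated assumptions, not a tested replacement: run_query is a hypothetical stand-in that prints two sample rows instead of calling sqlplus, and check_rows is an illustrative name.

```shell
#!/bin/bash
# Stand-in for a single sqlplus call. In the real check this would be:
#   "$sqlplus" -s -l "$USERNAME/$PASS@$SID" @/opt/bin/ops/mlb_tablespace.sql
# Assumption: two sample rows in the order name, free MB, used %, total MB.
run_query() {
    printf '%s\n' \
        'BAM_USER_INDX_LG 820 97.50% 32764' \
        'MLBDATASM 340 95.85% 8191'
}

# Reads rows from stdin and prints one line per row. Returns 2 if any
# row was seen (the SQL only lists tablespaces over their threshold);
# otherwise prints "All OK" and returns 0.
check_rows() {
    local status=0 name free pct total
    while read -r name free pct total; do
        [ -z "$name" ] && continue    # skip blank lines
        echo "NOK $name is at $pct $free/$total"
        status=2
    done
    [ "$status" -eq 0 ] && echo "All OK"
    return "$status"
}

exit_code=0
run_query | check_rows || exit_code=$?
echo "exit code: $exit_code"
```

A real plugin would end with exit "$exit_code" rather than echoing it. Note that in the script above every path through the loop body calls exit, so the errorCode check after done is never reached.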
And here's the SQL that the script calls.
--###########################################################################
--### THIS IS FOR TABLESPACE MONITORING with exclusion of TEMP and UNDO
--## Tablespace Alert - A tablespace has reached a critical state! #
--### Checks for different percentage thresholds by total size of the TS. #
--### Alert the DBA Group - Page - Phone Service #
--###########################################################################
set feedback off
set pagesize 0
set trimspool on
SELECT d.tablespace_name "NAME",
ROUND(NVL(f.bytes, 0)/1024/1024) "FREE(M)",
TO_CHAR(NVL((a.bytes - NVL(f.bytes, 0)) / a.bytes * 100, 0),'990.00')||'%' "USED %",
ROUND(NVL(a.bytes, 0)/1024/1024) "TOTAL(M)"
FROM sys.dba_tablespaces d,
(SELECT tablespace_name, sum(bytes) bytes
FROM dba_data_files group by tablespace_name) a,
(SELECT tablespace_name, sum(bytes) bytes
FROM dba_free_space group by tablespace_name) f
WHERE d.tablespace_name = a.tablespace_name(+)
AND d.tablespace_name = f.tablespace_name(+)
AND d.tablespace_name != (select VALUE from v$parameter where name
='undo_tablespace')
AND round((a.bytes - F.bytes)/a.bytes * 100) >=
CASE
WHEN a.bytes < 10737418240 THEN 90
WHEN a.bytes >= 10737418240 AND a.bytes < 21474836480 THEN 92
WHEN a.bytes >= 21474836480 AND a.bytes < 32212254720 THEN 94
WHEN a.bytes >= 32212254720 AND a.bytes < 42949672960 THEN 96
WHEN a.bytes >= 42949672960 AND a.bytes < 64424509440 THEN 97
WHEN a.bytes >= 64424509440 AND a.bytes < 118111600640 THEN 98
ELSE 99
END/* */
ORDER BY 4 desc
/
exit
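For readers more at home in bash than SQL, the size tiers in the CASE expression can be sketched as a lookup function. This is only an illustration of the thresholds, assuming sizes are passed in bytes; threshold_for and GIB are hypothetical names that do not appear in the script or the SQL.

```shell
#!/bin/bash
# Returns the alert threshold (percent used) for a tablespace of the
# given total size in bytes, following the tiers in the CASE expression.
GIB=1073741824    # 10737418240 in the SQL is 10 * GIB

threshold_for() {
    local bytes=$1
    if   [ "$bytes" -lt $((10  * GIB)) ]; then echo 90
    elif [ "$bytes" -lt $((20  * GIB)) ]; then echo 92
    elif [ "$bytes" -lt $((30  * GIB)) ]; then echo 94
    elif [ "$bytes" -lt $((40  * GIB)) ]; then echo 96
    elif [ "$bytes" -lt $((60  * GIB)) ]; then echo 97
    elif [ "$bytes" -lt $((110 * GIB)) ]; then echo 98
    else echo 99
    fi
}

# A 32764 MB tablespace falls in the 30 to 40 GiB tier.
threshold_for $((32764 * 1024 * 1024))
```

With the figures from the output earlier in this post, a 32764 MB tablespace gets a threshold of 96 and an 8191 MB one gets 90, so usage of 97.50% and 95.85% would both be reported.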
Thanks ahead of time for any and all input you might have!
Tim
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users