printing info from nagios db check script
Alex Griffin
agriffin at nagios.com
Mon May 14 17:26:19 CEST 2012
I haven't fully grokked the script you posted, but I do have a few
comments. First, you should quote every variable that is set by an
external command, unless you have some reason not to. Most unquoted
variables are bugs just waiting to show up in unexpected ways. You
should also quote any variable that is set in other ways if there is
any chance of it containing troublesome characters such as spaces or
newlines. Personally I just quote everything to avoid having to think
about the issue.
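To make the quoting point concrete, here is a minimal sketch of what
probably happens in the 'if' test. It assumes the command substitution
returns two tablespace names on separate lines; the 'names' value and
the two function names are made up for the demo and are not part of
the posted script.

```shell
#!/bin/bash
# Hypothetical stand-in for the sqlplus | awk pipeline: two names, one per line.
names=$(printf 'BAM_USER_INDX_LG\nMLBDATASM\n')

check_unquoted() {
    # Word splitting turns this into `[ BAM_USER_INDX_LG MLBDATASM ]`,
    # which test rejects as a syntax error, so the else branch runs.
    if [ $names ] 2>/dev/null; then echo "NOK"; else echo "All OK"; fi
}

check_quoted() {
    # The value stays one word, so test only checks that it is non-empty.
    if [ "$names" ]; then echo "NOK"; else echo "All OK"; fi
}

check_unquoted   # the error status of [ is treated as false
check_quoted
```

The unquoted version reports "All OK" even though the value is not
empty, while the quoted version reports "NOK". The demo hides the
error message with 2>/dev/null; without that redirection bash prints
the same 'unary operator expected' complaint shown in the original
post.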
Second, you could try adding 'set -x' to the top of your script. It
makes the shell print each command, after expansion, as it runs, which
should help you find problems.
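As a rough sketch of what the trace looks like, the snippet below
turns tracing on around a couple of commands. The get_pct and
trace_demo functions are hypothetical stand-ins, not part of the
posted script; get_pct just imitates one column of the query output.

```shell
#!/bin/bash
# Hypothetical stand-in for one column of the query output.
get_pct() { printf '97.50%%\n95.85%%\n'; }

trace_demo() {
    set -x                 # print each command, after expansion, to stderr
    pct=$(get_pct)
    echo "PCT is" $pct     # unquoted on purpose: the trace shows two values
    set +x                 # turn tracing off again
}

trace_demo
```

The trace goes to stderr, so it does not mix with the plugin output on
stdout. In the trace, the echo line shows both percentages as separate
arguments, which is the kind of surprise that is hard to spot from the
final output alone.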
Alex Griffin
---
Tech Team
agriffin at nagios.com
Tim Dunphy wrote:
> hey guys,
>
> this was kind of interesting so I thought I might report it to anyone who might take an interest in this thread. But I just realized where the 'additional' percentage that the tablespace check is reporting comes from.
>
> It comes from the fact that the check is attempting to report two tablespaces and that's why the formatting is falling apart.
>
> I altered the print statement of the loop a little bit and saw the important difference:
>
> while i=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $1}') j=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $3}') k=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $2}') l=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $4}')
> do
> TBLSPACE=$i
> PCT=$j
> FREE=$k
> TOTAL=$l
> if [ "$TBLSPACE" ]
> then
> echo "NOK" $TBLSPACE " is at " $PCT $FREE/$TOTAL
> exit 2
> else
> echo "All OK"
> exit 0
> fi
> done
>
> Now outputs:
>
> [db07-dc2:~] root% /opt/nagios/libexec/check_qaecom1_tablespace.sh
> NOK BAM_USER_INDX_LG MLBDATASM is at 97.50% 95.85% 820 340/32764 8191
>
>
> So what I need to try and figure out is how to print additional tablespaces that meet the threshold levels and have the output make sense visually. I will keep working on this, but would welcome any input you might have.
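One possible approach, offered as a sketch rather than a tested fix:
run the query once and read it one row at a time, so each tablespace
keeps its own values together. The run_query function below is a
hypothetical stand-in for a single sqlplus call, and its two rows are
reconstructed from the output shown above (name, free MB, used
percent, total MB). The report function name and the comma-separated
message format are also assumptions.

```shell
#!/bin/bash
# Hypothetical stand-in for ONE run of the sqlplus query.
# Columns: name, free MB, used percent, total MB.
run_query() {
    printf '%s\n' 'BAM_USER_INDX_LG 820 97.50% 32764' \
                  'MLBDATASM 340 95.85% 8191'
}

report() {
    rows=$(run_query)      # one call instead of four
    msg=""
    # Read one row per iteration instead of capturing whole columns.
    while read -r name free pct total; do
        [ -n "$name" ] || continue               # skip blank lines
        msg="${msg:+$msg, }$name is at $pct $free/$total"
    done <<EOF
$rows
EOF
    if [ -n "$msg" ]; then
        echo "NOK $msg"
        return 2
    fi
    echo "All OK"
    return 0
}

status=0
report || status=$?
# A real plugin would finish with: exit "$status"
```

With the two sample rows this prints a single line listing both
tablespaces, each with its own percentage and free/total figures.
Keeping everything on one line is deliberate here: as far as I know,
Nagios shows only the first line of plugin output in most views.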
>
>
> Thanks
> Tim
>
> ----- Original Message -----
> From: "Tim Dunphy"<bluethundr at jokefire.com>
> To: nagios-users at lists.sourceforge.net
> Sent: Sunday, May 13, 2012 6:27:36 PM
> Subject: printing info from nagios db check script
>
> Hey list,
>
> I'm having a small but important problem with a script I am writing in order to monitor the tablespaces of an oracle database. This is probably more of a bash programming issue, but as the ultimate purpose of the script is to be a nagios check I am hoping that you won't mind me asking here.
>
>
> Just to give you a brief overview of what I am experiencing, I'd like to start by giving you the output of a couple runs of the script and a couple snippets of code.
>
> First:
>
> [db07-dc2:~] root% /opt/nagios/libexec/check_qaecom1_tablespace.sh
> /opt/nagios/libexec/check_qaecom1_tablespace.sh: line 60: [: BAM_USER_INDX_LG: unary operator expected
> All OK
>
>
> This output is mainly produced by this loop:
>
> while i=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $1}') j=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $3}') k=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $2}') l=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $4}')
> do
> TBLSPACE=$i
> PCT=$j
> FREE=$k
> TOTAL=$l
> if [ $TBLSPACE ]
> then
> echo "NOK $TBLSPACE is at $PCT $FREE/$TOTAL "
> exit 2
> else
> echo "All OK"
> exit 0
> fi
> done
>
> But as you can see a 'unary operator' error is produced by this code. As you can probably tell, all it does is execute a few sqlplus commands (with the SQL contained in a separate file) and assign them to a few variables. It incorrectly produces an 'OK' state.
>
>
> But if I try to fix the 'unary operator' error by putting the TBLSPACE variable in quotes, the result changes from "All OK" to showing some output and (correctly) produces an error state:
>
> [db07-dc2:~] root% /opt/nagios/libexec/check_qaecom1_tablespace.sh
> NOK BAM_USER_INDX_LG
> MLBDATASM is at 97.50%
> 95.85% 820
> 340/32764
> 8191
>
>
>
> while i=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $1}') j=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $3}') k=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $2}') l=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $4}')
> do
> TBLSPACE=$i
> PCT=$j
> FREE=$k
> TOTAL=$l
> if [ "$TBLSPACE" ]
> then
> echo "NOK $TBLSPACE is at $PCT $FREE/$TOTAL "
> exit 2
> else
> echo "All OK"
> exit 0
> fi
> done
>
> The SQL is designed to list any tablespaces that grow beyond a certain threshold.
>
>
> My question is this:
>
> [db07-dc2:~] root% /opt/nagios/libexec/check_qaecom1_tablespace.sh
> NOK BAM_USER_INDX_LG #<-- this is correctly output from the script
> MLBDATASM is at 97.50% #<-- this is also correctly output from the script
> 95.85% 820 #<-- but where does this additional and different percentage come from?
> 340/32764 #<-- and why are the two variables $FREE/$TOTAL broken up on different lines?
> 8191
>
>
>
> I realize that this may not be an easy question, but I wanted to put this out there in case anyone has faced a similar situation before.
>
>
>
> Here is the entirety of the shell script:
>
>
> #!/bin/bash
>
> # exit codes
> CRED_ERR=1 # if the credentials are not valid
> NOARGS=2 # if the required parameters were not supplied
>
> # credentials / environment variables
> ORACLE_HOME="/u01/app/oracle/product/10.2.0.4"
> ORACLE_SID=qaecom1
> sqlplus="/u01/app/oracle/product/10.2.0.4/bin/sqlplus"
> USERNAME=mlbwatch
> PASS=n3x1ch3q
> SID=${ORACLE_SID}
>
> if [ -z "${USERNAME}" ] || [ -z "${PASS}" ]; # Exit if no arguments were given.
> then
> echo "Error: Username or Password are empty"
> exit $NOARGS
> fi ;
>
> PATH=$PATH:$ORACLE_HOME/bin
> LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib
> export ORACLE_HOME PATH LD_LIBRARY_PATH
>
>
>
> while i=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $1}') j=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $3}') k=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $2}') l=$($sqlplus -s -l $USERNAME/$PASS@$SID @/opt/bin/ops/mlb_tablespace.sql | awk '{print $4}')
> do
> TBLSPACE=$i
> PCT=$j
> FREE=$k
> TOTAL=$l
> if [ "$TBLSPACE" ]
> then
> echo "NOK $TBLSPACE is at $PCT $FREE/$TOTAL "
> exit 2
> else
> echo "All OK"
> exit 0
> fi
> done
>
>
> errorCode=$? # checks if the last operation (sqlplus) was completed successfully or not
> if [ ${errorCode} -ne 0 ]
> then
> echo "Running sqlplus FAILED"
> exit ${CRED_ERR}
> echo
> fi
>
>
> And here's the SQL that the script calls.
>
> --###########################################################################
> --### THIS IS FOR TABLESPACE MONITORING with exclusion of TEMP and UNDO
> --## Tablespace Alert - A tablespace has reached a critical state! #
> --### Checks for different percentage thresholds by total size of the TS. #
> --### Alert the DBA Group - Page - Phone Service #
> --###########################################################################
> set feedback off
> set pagesize 0
> set trimspool on
>
> SELECT d.tablespace_name "NAME",
> ROUND(NVL(f.bytes, 0)/1024/1024) "FREE(M)",
> TO_CHAR(NVL((a.bytes - NVL(f.bytes, 0)) / a.bytes * 100, 0),'990.00')||'%' "USED %",
> ROUND(NVL(a.bytes, 0)/1024/1024) "TOTAL(M)"
> FROM sys.dba_tablespaces d,
> (SELECT tablespace_name, sum(bytes) bytes
> FROM dba_data_files group by tablespace_name) a,
> (SELECT tablespace_name, sum(bytes) bytes
> FROM dba_free_space group by tablespace_name) f
> WHERE d.tablespace_name = a.tablespace_name(+)
> AND d.tablespace_name = f.tablespace_name(+)
> AND d.tablespace_name != (select VALUE from v$parameter where name
> ='undo_tablespace')
> AND round((a.bytes - F.bytes)/a.bytes * 100)>=
> CASE
> WHEN a.bytes< 10737418240 THEN 90
> WHEN a.bytes>= 10737418240 AND a.bytes< 21474836480 THEN 92
> WHEN a.bytes>= 21474836480 AND a.bytes< 32212254720 THEN 94
> WHEN a.bytes>= 32212254720 AND a.bytes< 42949672960 THEN 96
> WHEN a.bytes>= 42949672960 AND a.bytes< 64424509440 THEN 97
> WHEN a.bytes>= 64424509440 AND a.bytes< 118111600640 THEN 98
> ELSE 99
> END/* */
> ORDER BY 4 desc
> /
> exit
>
>
> Thanks ahead of time for any and all input you might have!
>
> Tim
>
>
>
>
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null