monitor number of open files on linux
Allan Clark
allanc at chickenandporn.com
Fri Jun 8 23:08:30 CEST 2012
On Fri, Jun 8, 2012 at 1:53 PM, Parkman, Mikhail <
Mikhail_Parkman at cable.comcast.com> wrote:
> Thanks - I decided to go with check_open_files.pl
> http://exchange.nagios.org/directory/Plugins/Uncategorized/Operating-Systems/Linux/check-open-files/details
>
> I didn't find help_me/read_me info for this plugin.
> After I installed it on the target box into /usr/local/nagios/libexec and
> just executed it, I got:
> ----------
> [root at target_host libexec]# ./check_open_files.pl
> Usage: -w <warn> -c <crit> [-t <timeout>] [-v version] [-h help]
> [root at target_host libexec]#
> ======
> That told me that I should run it at least with "-w some_value1 -c
> some_value2"
> Then I tried to run it with different -w -c values and I am not clear why
> I am getting different threshold values (bold, red) :
> ===============
> [root@ target_host libexec]# ./check_open_files.pl -w 500 -c 10000
> OK: open files (4590) is below threshold (*16194515/323890300*)|open_files=4590;*16194515;323890300*
> [root@ target_host libexec]# ./check_open_files.pl -w 1000 -c 10000
> OK: open files (4590) is below threshold (*32389030/323890300*)|open_files=4590;*32389030;323890300*
> [root@ target_host libexec]# ./check_open_files.pl -w 10 -c 100
> OK: open files (4590) is below threshold (*323890/3238903*)|open_files=4590;*323890;3238903*
> ===============
> Why do I get in response 2 threshold values and why are they different
> each time I enter another number of warning and critical limits?
>
In general terms, comparing with other plugins:
1) You're getting "OK" because 4590 is below the thresholds you've set. Had
it exceeded 323890 (in the "-w 10" example) you'd get WARNING, and had it
exceeded the second value, CRITICAL. The plugin echoes the actual thresholds
because they are calculated rather than taken verbatim, so when the count is
below them but the user thinks it shouldn't be, the Nagios/Icinga screen
shows the reference values alongside the status text.
2) Your question as to why the numbers change might be more complex than
I'm reading, but the plugin appears to treat -w and -c as percentages of the
system-wide file limit (3238903 on your box, judging by the "-c 100" run):
-w 500 --> 500% of (cat /proc/sys/fs/file-max) ==> 16194515
-c 10000 --> 10000% of (cat /proc/sys/fs/file-max) ==> 323890300
Have I misread your question(s)?
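To make the arithmetic concrete, here is a minimal Python sketch of that reading. I have not looked at the plugin source, so the function name and the truncating integer division are my assumptions; they just happen to reproduce every threshold in your output when file-max is 3238903.

```python
def scaled_thresholds(file_max, warn_pct, crit_pct):
    """Turn -w/-c percentages into absolute open-file counts.

    Hypothetical helper, not taken from check_open_files.pl.
    """
    # Integer division truncates, which matches the 323890 printed for -w 10
    # (10% of 3238903 is 323890.3).
    warn = file_max * warn_pct // 100
    crit = file_max * crit_pct // 100
    return warn, crit


# Value implied by the "-c 100" run in the quoted output; on a live box it
# would come from /proc/sys/fs/file-max.
FILE_MAX = 3238903

for warn_pct, crit_pct in [(500, 10000), (1000, 10000), (10, 100)]:
    print(warn_pct, crit_pct, scaled_thresholds(FILE_MAX, warn_pct, crit_pct))
```

The three pairs it prints should line up with the three threshold pairs in your runs.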
I would suggest you pass the thresholds as percentages of that limit, i.e.
values below 100. I'm not sure 50% and 80% are good numbers, but
"-w 50 -c 80" would achieve those.
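As a rough illustration of what "-w 50 -c 80" would mean on your box, here is a Python sketch under the same percentage assumption. The function names are made up, and whether the real plugin compares with ">" or ">=" is a guess on my part. Reading the count from /proc/sys/fs/file-nr is also my choice for the sketch, not necessarily what the plugin does.

```python
def read_file_nr(path="/proc/sys/fs/file-nr"):
    """Return (allocated_handles, file_max) from the kernel's file-nr.

    The file holds three numbers: allocated handles, allocated-but-unused
    handles, and the system-wide maximum.
    """
    with open(path) as fh:
        allocated, _unused, maximum = (int(x) for x in fh.read().split())
    return allocated, maximum


def classify(open_files, file_max, warn_pct, crit_pct):
    """Hypothetical check: percentages of file_max become absolute limits."""
    warn = file_max * warn_pct // 100
    crit = file_max * crit_pct // 100
    # Assumption: reaching a limit already trips it (>=).
    if open_files >= crit:
        status = "CRITICAL"
    elif open_files >= warn:
        status = "WARNING"
    else:
        status = "OK"
    return status, warn, crit


# Numbers from the quoted runs: 4590 open files, file-max of 3238903.
print(classify(4590, 3238903, 50, 80))
```

With those inputs the limits work out to 1619451 and 2591122, so 4590 open files would still be a comfortable OK.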
Allan
--
allanc at chickenandporn.com "金鱼" http://linkedin.com/in/goldfish