process_check_result_file does not delete files
Hendrik Baecker
andurin at process-zero.de
Sun Jun 17 19:41:34 CEST 2007
Hi List,
Florian was right. I think there is a small typo in the code that generates
the check result files, specifically the "ok-to-go" file for each
check result file.
Without that "ok-to-go" file, no check result file goes back to the
core for processing, I think. Correct me if I am wrong.
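A minimal sketch of that gate, not the actual Nagios code: a result file only counts as ready when a matching "<file>.ok" marker exists next to it. The helper name has_ok_marker and the fixed buffer size are assumptions made for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not taken from base/utils.c:
 * returns 1 if "<path>.ok" exists, 0 otherwise. */
int has_ok_marker(const char *path)
{
    char ok_path[4096];
    int n = snprintf(ok_path, sizeof(ok_path), "%s.ok", path);

    /* treat a formatting error or an over-long path as "not ready" */
    if (n < 0 || (size_t)n >= sizeof(ok_path))
        return 0;

    /* F_OK only tests that the marker exists, not its permissions */
    return access(ok_path, F_OK) == 0;
}
```

A reaper loop built on this would skip any file for which has_ok_marker() returns 0, which would match the log quoted further down, where every file is opened and left without being deleted.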
This patch against the current CVS code should fix this and create a
*.ok file for each check result file.
See this ls -la of my spool dir:
-rw------- 1 nagios nagios 438 2007-06-17 19:33 cDGnX9b
--w---S--T 1 nagios nagios 0 2007-06-17 19:33 cDGnX9b.ok
(Can someone explain these curious file modes to me?)
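On the modes: the string --w---S--T decodes to octal 03200, that is owner write plus the setgid and sticky bits without the matching execute bits. One plausible cause, which is an assumption and not confirmed anywhere in this thread, is an open() call with O_CREAT but without the third (mode) argument, so the mode is taken from whatever happens to be in that argument slot. A minimal sketch of creating the marker with an explicit mode; create_ok_marker is a hypothetical name, not a function from the Nagios source:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create an empty "<path>.ok" marker, requesting
 * mode 0600 (rw for the owner). Returns 0 on success, -1 on failure. */
int create_ok_marker(const char *path)
{
    char ok_path[4096];
    int fd;
    int n = snprintf(ok_path, sizeof(ok_path), "%s.ok", path);

    if (n < 0 || (size_t)n >= sizeof(ok_path))
        return -1;

    /* with O_CREAT the third argument is required;
     * leaving it out gives an unpredictable file mode */
    fd = open(ok_path, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    return close(fd);
}
```

With a call like this the marker should show up as -rw------- in ls -la (subject to the umask), the same as the check result file itself.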
Here's the patch.
Kind regards,
Hendrik
Florian Gleixner wrote:
> Hi,
>
> I have one environment for testing things, and there I have only active
> checks against the local host. On a staging environment I have both
> active and passive checks. There I added some more debug output. Here's
> part of the log:
>
>
> [1182022126.030951:008.0] ** Timed Event ** Type: 5, Run Time: Sat Jun 16 21:28:46 2007
> [1182022126.030964:008.0] ** Check Result Reaper
> [1182022126.030976:001.0] reap_check_results() start
> [1182022126.030989:016.0] Starting to reap check results.
> [1182022126.031002:001.0] process_check_result_queue() start
> [1182022128.556302:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/c03yc9w) start
> [1182022128.556491:001.0] Exit function process_check_result_file
> [1182022129.616721:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/c0dyBOx) start
> [1182022129.616906:001.0] Exit function process_check_result_file
> [1182022130.134757:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/c790SZO) start
> [1182022130.134955:001.0] Exit function process_check_result_file
> [1182022130.241383:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/c3hDsZp) start
> [1182022130.241565:001.0] Exit function process_check_result_file
> [1182022132.096308:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/cEQBt7g) start
> [1182022132.096475:001.0] Exit function process_check_result_file
> [1182022133.608091:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/cM3Yy1C) start
> [1182022133.608265:001.0] Exit function process_check_result_file
> [1182022134.142609:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/cJm696X) start
> [1182022134.142789:001.0] Exit function process_check_result_file
> [1182022134.419045:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/cJv8bsK) start
> [1182022134.419223:001.0] Exit function process_check_result_file
> [1182022135.395489:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/cOSVLKp) start
> [1182022135.395674:001.0] Exit function process_check_result_file
> [1182022136.425076:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/cOxyBYw) start
> [1182022136.425257:001.0] Exit function process_check_result_file
> [1182022137.468602:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/cUx4Gyd) start
> [1182022137.468776:001.0] Exit function process_check_result_file
> [1182022138.597791:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/co5w52A) start
> [1182022138.597977:001.0] Exit function process_check_result_file
> [1182022138.612402:001.0] process_check_result_file(/usr/local/nagios/var/spool/checkresults/cnG6TC0) start
> [1182022138.612575:001.0] Exit function process_check_result_file
> [1182022138.849846:001.0] process_check_result_queue() end
>
>
>
> Here's what I added to base/utils.c (the diff runs from my patched tree to
> plain CVS, so the lines marked "-" are the ones I added):
>
> diff -u nagios-cvs_patched/base/utils.c nagios-cvs/base/utils.c
> --- nagios-cvs_patched/base/utils.c 2007-06-15 16:44:25.000000000 +0200
> +++ nagios-cvs/base/utils.c 2007-05-30 23:41:08.000000000 +0200
> @@ -4018,13 +4018,11 @@
> char *temp_buffer=NULL;
> int result=OK;
>
> - log_debug_info(DEBUGL_FUNCTIONS,0,"process_check_result_queue() start\n");
> /* open the directory for reading */
> dirp=opendir(dirname);
> if(dirp==NULL){
> asprintf(&temp_buffer,"Error: Could not open check result queue directory '%s' for reading.\n",dirname);
>
> write_to_logs_and_console(temp_buffer,NSLOG_CONFIG_ERROR,TRUE);
> - log_debug_info(DEBUGL_FUNCTIONS,0,temp_buffer);
> my_free((void **)&temp_buffer);
> return ERROR;
> }
> @@ -4073,7 +4071,6 @@
>
> closedir(dirp);
>
> - log_debug_info(DEBUGL_FUNCTIONS,0,"process_check_result_queue() end\n");
> return result;
>
> }
> @@ -4093,10 +4090,6 @@
> time_t current_time;
> check_result *new_cr=NULL;
>
> - asprintf(&temp_buffer,"process_check_result_file(%s) start\n",fname);
> - log_debug_info(DEBUGL_FUNCTIONS,0,temp_buffer);
> - my_free((void **)&temp_buffer);
> -
> if(fname==NULL)
> return ERROR;
>
> @@ -4107,7 +4100,6 @@
>
> /* try removing the file - zero length files can't be mmap()'ed, so it might exist */
> unlink(fname);
> - log_debug_info(DEBUGL_FUNCTIONS,0,"problem mmapping file. Exit function\n");
>
> return ERROR;
> }
> @@ -4251,13 +4243,11 @@
> /* other (current) files are deleted later (when results are processed) */
> if(delete_file==TRUE){
> unlink(fname);
> - log_debug_info(DEBUGL_FUNCTIONS,0,"Deleting cache file\n");
> asprintf(&temp_buffer,"%s.ok",fname);
> unlink(temp_buffer);
> my_free((void **)&temp_buffer);
> }
>
> - log_debug_info(DEBUGL_FUNCTIONS,0,"Exit function process_check_result_file\n");
> return OK;
> }
>
> As you can see, I don't get "Delete cache file" in the debug log. A grep
> of the log doesn't turn up "Delete cache file" either. Here's a cache file:
>
> ### Active Check Result File ###
> file_time=1181939515
>
> ### Nagios Service Check Result ###
> # Time: Fri Jun 15 22:31:55 2007
> host_name=localhost
> service_description=Swap Usage
> check_type=0
> scheduled_check=1
> reschedule_check=1
> latency=0.142000
> start_time=1181939515.142709
> finish_time=1181939515.154620
> early_timeout=0
> exited_ok=1
> return_code=0
> output=SWAP OK - 100% free (964 MB out of 964 MB) |swap=964MB;0;0;0;964\n
>
>
> I think the files should always be deleted in process_check_result_file. Or
> there should be a timed event that cleans the cache dir from time to time.
>
> Flo
>
>
> Hendrik Bäcker wrote:
>> Hi Florian,
>>
>> you are right!
>>
>> Nagios should delete the files after processing them.
>>
>> Could you please tell us more about your installation, for example whether
>> you use only active checks or both active and passive service checks.
>>
>> Please also have a look at your nagios.log to see whether it says anything
>> about passive checks for non-existing hosts or services (for this, enable
>> the logging of external commands and every other logging option you can find).
>>
>> Further, it would be nice if you could send one or two of your check result
>> files so that we can have a look at them.
>>
>> Kind regards
>> Hendrik
>>
>> Florian Gleixner wrote:
>>> Hi,
>>> my check result cache directory contains many files. Nagios 3 (CVS last
>>> week) generates more and more files and does not delete them. I think it
>>> should delete them in process_check_result_file(). At the moment it
>>> deletes the files only if they are too old, but I think it should also
>>> delete them once Nagios has parsed them successfully, or Nagios should
>>> check the directory from time to time and delete old files.
>>> Flo
>>> -------------------------------------------------------------------------
>>> This SF.net email is sponsored by DB2 Express
>>> Download DB2 Express C - the FREE version of DB2 express and take
>>> control of your XML. No limits. Just data. Click to get it now.
>>> http://sourceforge.net/powerbar/db2/
>>> _______________________________________________
>>> Nagios-devel mailing list
>>> Nagios-devel at lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/nagios-devel
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: utils.c.patch
Type: text/x-patch
Size: 442 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20070617/eaedcb27/attachment.bin>