serious performance issue

Surajit Mukherjee Surajit.Mukherjee at ness.com
Fri Apr 10 06:56:44 CEST 2009


Hi All,

I am facing the same kind of issue. I am using Nagios 3.0.6 on
Red Hat 5.

I am not getting archive logs in the notification area; instead it says:
Error: Cannot open log file
'/usr/local/nagios/var/archives/nagios-04-10-2009-00.log' for reading!
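
One way to narrow down a "Cannot open log file" error is to check whether the archive directory and file exist and are readable. Below is a minimal Python sketch, assuming it is run as the account the Nagios CGIs run under (the thread does not say which account that is). check_archive is a hypothetical helper name, not part of Nagios.

```python
import os

def check_archive(path):
    """Return a short diagnosis for a Nagios archive log path."""
    directory = os.path.dirname(path)
    if not os.path.isdir(directory):
        return "archive directory missing: " + directory
    if not os.path.exists(path):
        return "archive file missing: " + path
    if not os.access(path, os.R_OK):
        return "archive file not readable by current user: " + path
    return "ok"

if __name__ == "__main__":
    # Path copied from the error message above.
    print(check_archive(
        "/usr/local/nagios/var/archives/nagios-04-10-2009-00.log"))
```

If the directory itself is reported missing, it may be worth comparing the path in the error with the log_archive_path value in nagios.cfg.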

Please help.

Surajit

________________________________

From: shadih rahman [mailto:shadhin71 at gmail.com] 
Sent: Thursday, April 09, 2009 7:25 PM
To: fancyrabbit
Cc: nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] serious performance issue

 

Now my Nagios is not running any checks at all.  I get a lot of "looks
like it was orphaned" messages and then Nagios just sits there.  Can
someone help me with this?  I have included some entries from nagios.debug
and nagios.log below, along with my nagios.cfg.  Thanks in advance.




nagios.debug: 

[1239284464.560241] [016.2] [pid=15690] Found another host check event for this host @ Thu Apr  9 08:59:56 2009
[1239284464.560248] [016.2] [pid=15690] New host check event occurs after the existing event, so we'll ignore it.
[1239284464.560253] [016.2] [pid=15690] Keeping original host check event (ignoring the new one).
[1239284464.560261] [016.1] [pid=15690] ** Async check result for host 'iab323pc20.atg.columbia.edu' handled: new state=0



nagios.log:


[1239254607] Warning: The check of host 'et251pc70.atg.columbia.edu' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the host...
[1239254607] Warning: The check of host 'et251pc71.atg.columbia.edu' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the host...
[1239254607] Warning: The check of host 'et251pc72.atg.columbia.edu' looks like it was orphaned (results never came back).  I'm scheduling an immediate check of the host...
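
To see how widespread the orphaned checks are, the warnings can be tallied per host. The following Python sketch assumes each log entry is on a single line, as Nagios writes it (the line breaks in the pasted excerpt come from the mail client). Only the host form of the message appears in this thread. The regex also accepts "service", on the assumption that service warnings use the same wording. count_orphans is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches e.g. "[1239254607] Warning: The check of host 'name' looks like
# it was orphaned ...". The "service" alternative is an assumption.
ORPHAN_RE = re.compile(
    r"^\[(\d+)\] Warning: The check of (host|service) '([^']+)'"
    r".*looks like it was orphaned"
)

def count_orphans(lines):
    """Count orphan warnings per (kind, name); return counts, first and last timestamp."""
    counts = Counter()
    first = last = None
    for line in lines:
        m = ORPHAN_RE.match(line)
        if not m:
            continue
        ts = int(m.group(1))
        counts[(m.group(2), m.group(3))] += 1
        first = ts if first is None else min(first, ts)
        last = ts if last is None else max(last, ts)
    return counts, first, last

SAMPLE = [
    "[1239254607] Warning: The check of host 'et251pc70.atg.columbia.edu' "
    "looks like it was orphaned (results never came back).  "
    "I'm scheduling an immediate check of the host...",
    "[1239254607] Warning: The check of host 'et251pc71.atg.columbia.edu' "
    "looks like it was orphaned (results never came back).  "
    "I'm scheduling an immediate check of the host...",
]

if __name__ == "__main__":
    counts, first, last = count_orphans(SAMPLE)
    for (kind, name), n in counts.most_common():
        print(kind, name, n)
    print("time range (epoch seconds):", first, "to", last)
```

To run it against the real log, pass an open file object (for example the log_file path from nagios.cfg) instead of SAMPLE.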


nagiostats:

Nagios Stats 3.0.6
Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
Last Modified: 12-01-2008
License: GPL

CURRENT STATUS DATA
------------------------------------------------------
Status File:                            /var/log/nagios/status.dat
Status File Age:                        0d 0h 0m 4s
Status File Version:                    3.0.6

Program Running Time:                   0d 15h 37m 5s
Nagios PID:                             15690
Used/High/Total Command Buffers:        0 / 1 / 4096

Total Services:                         2783
Services Checked:                       2783
Services Scheduled:                     2782
Services Actively Checked:              2783
Services Passively Checked:             0
Total Service State Change:             0.000 / 38.820 / 0.328 %
Active Service Latency:                 244.062 / 37353.761 / 22185.948 sec
Active Service Execution Time:          0.010 / 15.072 / 0.293 sec
Active Service State Change:            0.000 / 38.820 / 0.328 %
Active Services Last 1/5/15/60 min:     0 / 0 / 0 / 0
Passive Service Latency:                0.000 / 0.000 / 0.000 sec
Passive Service State Change:           0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min:    0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit:              2571 / 14 / 143 / 55
Services Flapping:                      19
Services In Downtime:                   0

Total Hosts:                            3037
Hosts Checked:                          3005
Hosts Scheduled:                        3030
Hosts Actively Checked:                 3037
Host Passively Checked:                 0
Total Host State Change:                0.000 / 57.170 / 0.448 %
Active Host Latency:                    0.000 / 36712.008 / 19785.947 sec
Active Host Execution Time:             0.000 / 30.011 / 1.589 sec
Active Host State Change:               0.000 / 57.170 / 0.448 %
Active Hosts Last 1/5/15/60 min:        0 / 0 / 0 / 299
Passive Host Latency:                   0.000 / 0.000 / 0.000 sec
Passive Host State Change:              0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min:       0 / 0 / 0 / 0
Hosts Up/Down/Unreach:                  2854 / 183 / 0
Hosts Flapping:                         16
Hosts In Downtime:                      0

Active Host Checks Last 1/5/15 min:     0 / 0 / 0
   Scheduled:                           0 / 0 / 0
   On-demand:                           0 / 0 / 0
   Parallel:                            0 / 0 / 0
   Serial:                              0 / 0 / 0
   Cached:                              0 / 0 / 0
Passive Host Checks Last 1/5/15 min:    0 / 0 / 0
Active Service Checks Last 1/5/15 min:  0 / 0 / 0
   Scheduled:                           0 / 0 / 0
   On-demand:                           0 / 0 / 0
   Cached:                              0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0

External Commands Last 1/5/15 min:      0 / 0 / 0
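
The latency figures are easier to judge once converted out of seconds. Here is a small Python sketch that extracts the active latency triples from nagiostats output, assuming the three values are minimum / maximum / average and that each entry sits on one line. parse_latencies is a hypothetical helper name.

```python
import re

# Captures the label and the three numeric fields of the active latency lines.
TRIPLE_RE = re.compile(
    r"^\s*(Active (?:Service|Host) Latency):\s+"
    r"([\d.]+) / ([\d.]+) / ([\d.]+)"
)

def parse_latencies(text):
    """Map each latency label to a (min, max, avg) tuple in seconds."""
    result = {}
    for line in text.splitlines():
        m = TRIPLE_RE.match(line)
        if m:
            result[m.group(1)] = tuple(float(g) for g in m.groups()[1:])
    return result

# Two lines copied from the nagiostats output above.
SAMPLE = """\
Active Service Latency:                 244.062 / 37353.761 / 22185.948 sec
Active Host Latency:                    0.000 / 36712.008 / 19785.947 sec
"""

if __name__ == "__main__":
    for label, (low, high, avg) in parse_latencies(SAMPLE).items():
        print(f"{label}: avg {avg / 3600:.1f} h, max {high / 3600:.1f} h")
```

Read this way, the average active service latency of 22185.948 seconds is a little over six hours, which fits the report that no checks have run in the last 15 minutes.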



nagios.cfg:

log_file=/var/log/nagios/nagios.log
cfg_file=/etc/nagios/commands.cfg
cfg_file=/etc/nagios/contacts.cfg
cfg_file=/etc/nagios/timeperiods.cfg
cfg_file=/etc/nagios/templates.cfg
cfg_dir=/etc/nagios/hosts
cfg_dir=/etc/nagios/services
object_cache_file=/var/log/nagios/objects.cache
precached_object_file=/var/log/nagios/objects.precache
resource_file=/etc/nagios/resource.cfg
status_file=/var/log/nagios/status.dat
status_update_interval=60
nagios_user=nagios
nagios_group=nagios
check_external_commands=1
command_check_interval=-1
command_file=/var/log/nagios/rw/nagios.cmd
external_command_buffer_slots=4096
lock_file=/var/log/nagios/nagios.lock
temp_file=/var/log/nagios/nagios.tmp
temp_path=/tmp
event_broker_options=8
broker_module=/usr/lib64/nagios/ndomod.o config_file=/etc/nagios/ndomod.cfg
log_rotation_method=m
log_archive_path=/var/log/nagios/archives
use_syslog=1
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=1
service_inter_check_delay_method=s
max_service_check_spread=30
service_interleave_factor=s
host_inter_check_delay_method=s
max_host_check_spread=30
max_concurrent_checks=0
check_result_reaper_frequency=10
max_check_result_reaper_time=20
check_result_path=/var/log/nagios/spool/checkresults
max_check_result_file_age=3600
cached_host_check_horizon=15
cached_service_check_horizon=15
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
soft_state_dependencies=0
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.25
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=60
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
state_retention_file=var/log/nagios/retention.dat
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
interval_length=60
use_aggressive_host_checking=0
execute_service_checks=1
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=1
enable_event_handlers=1
process_performance_data=0
translate_passive_host_checks=0
passive_host_checks_are_soft=0
check_for_orphaned_services=1
check_for_orphaned_hosts=1
check_service_freshness=1
service_freshness_check_interval=60
check_host_freshness=0
host_freshness_check_interval=60
additional_freshness_latency=15
enable_flap_detection=1
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
date_format=us
enable_embedded_perl=0
use_embedded_perl_implicitly=0
illegal_object_name_chars=`~!$%^&*|'"<>?,()=
illegal_macro_output_chars=`~$&|'"<>
use_regexp_matching=0
use_true_regexp_matching=0
admin_email=sr2690 at columbia.edu
daemon_dumps_core=0
use_large_installation_tweaks=1
enable_environment_macros=1
debug_level=-1
debug_verbosity=2
debug_file=/var/log/nagios/nagios.debug
max_debug_file_size=1000000
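
With a configuration this long, a quick scripted pass can catch small slips. The Python sketch below parses key=value lines and lists path-like directives whose value does not start with "/". The suffix list (_file, _path, _dir) is a heuristic chosen for this example, and parse_cfg and relative_paths are hypothetical helper names. Whether Nagios accepts or resolves a relative path for a given directive is not confirmed here, so any hit is only something to double-check.

```python
def parse_cfg(text):
    """Parse key=value lines into a list of (key, value) pairs.

    A list is used rather than a dict because directives such as
    cfg_file and cfg_dir may appear more than once.
    """
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        pairs.append((key.strip(), value.strip()))
    return pairs

# Heuristic: directive names with these endings usually hold a path.
PATH_SUFFIXES = ("_file", "_path", "_dir")

def relative_paths(pairs):
    """Return path-like directives whose value does not start with '/'."""
    return [(k, v) for k, v in pairs
            if k.endswith(PATH_SUFFIXES) and not v.startswith("/")]

# Three lines copied from the nagios.cfg above.
SAMPLE = """\
status_file=/var/log/nagios/status.dat
state_retention_file=var/log/nagios/retention.dat
debug_file=/var/log/nagios/nagios.debug
"""

if __name__ == "__main__":
    for key, value in relative_paths(parse_cfg(SAMPLE)):
        print("relative path:", key, "=", value)
```

On the sample lines this reports state_retention_file, whose value has no leading slash in the posted configuration.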



On Wed, Apr 8, 2009 at 1:56 AM, fancyrabbit <fancyrabbit at gmail.com>
wrote:

I ran into almost the same issue.

After setting enable_embedded_perl=0, the load average went up
but latencies became lower.

On Wed, Apr 8, 2009 at 11:54 AM, shadih rahman <shadhin71 at gmail.com>
wrote:

	I am seeing a ton of "orphaned" error messages for both services
	and hosts.  I am running Nagios on a quad core 2.2 GHZ machine running 4
	GHZ memory.  I will paste my configuration file below.  The machine
	sends NDO data to a local database sitting on a 170 GB hard drive.
	Nagios is obsessing over both hosts and services and sending the data
	to a machine with an identical configuration.  I am doing failover
	using NSCA.  Please advise on this.
	
	
	
	
	
	nagios.cfg
	
	
	
	log_file=/var/log/nagios/nagios.log
	cfg_file=/etc/nagios/commands.cfg
	cfg_file=/etc/nagios/contacts.cfg
	cfg_file=/etc/nagios/timeperiods.cfg
	cfg_file=/etc/nagios/templates.cfg
	cfg_dir=/etc/nagios/hosts
	cfg_dir=/etc/nagios/services
	object_cache_file=/var/log/nagios/objects.cache
	precached_object_file=/var/log/nagios/objects.precache
	resource_file=/etc/nagios/resource.cfg
	status_file=/var/log/nagios/status.dat
	status_update_interval=60
	nagios_user=nagios
	nagios_group=nagios
	check_external_commands=1
	command_check_interval=-1
	command_file=/var/log/nagios/rw/nagios.cmd
	external_command_buffer_slots=8192
	lock_file=/var/log/nagios/nagios.lock
	temp_file=/var/log/nagios/nagios.tmp
	temp_path=/tmp
	event_broker_options=8
	broker_module=/usr/lib64/nagios/ndomod.o config_file=/etc/nagios/ndomod.cfg
	log_rotation_method=m
	log_archive_path=/var/log/nagios/archives
	use_syslog=1
	log_notifications=1
	log_service_retries=1
	log_host_retries=1
	log_event_handlers=1
	log_initial_states=0
	log_external_commands=1
	log_passive_checks=1
	service_inter_check_delay_method=n
	max_service_check_spread=30
	service_interleave_factor=s
	host_inter_check_delay_method=s
	max_host_check_spread=30
	max_concurrent_checks=0
	check_result_reaper_frequency=2
	max_check_result_reaper_time=10
	check_result_path=/var/log/nagios/spool/checkresults
	max_check_result_file_age=3600
	cached_host_check_horizon=15
	cached_service_check_horizon=15
	enable_predictive_host_dependency_checks=1
	enable_predictive_service_dependency_checks=1
	soft_state_dependencies=1
	auto_reschedule_checks=1
	auto_rescheduling_interval=30
	auto_rescheduling_window=180
	sleep_time=0.25
	service_check_timeout=30
	host_check_timeout=20
	
	event_handler_timeout=30
	notification_timeout=60
	ocsp_timeout=5
	perfdata_timeout=5
	retain_state_information=1
	state_retention_file=var/log/nagios/retention.dat
	retention_update_interval=60
	use_retained_program_state=1
	use_retained_scheduling_info=1
	retained_host_attribute_mask=0
	retained_service_attribute_mask=0
	retained_process_host_attribute_mask=0
	retained_process_service_attribute_mask=0
	retained_contact_host_attribute_mask=0
	retained_contact_service_attribute_mask=0
	interval_length=60
	use_aggressive_host_checking=0
	execute_service_checks=1
	accept_passive_service_checks=1
	execute_host_checks=1
	accept_passive_host_checks=1
	enable_notifications=1
	enable_event_handlers=1
	process_performance_data=0
	obsess_over_services=1
	ocsp_command=send_service_check
	ochp_command=send_host_check
	obsess_over_hosts=1
	translate_passive_host_checks=0
	passive_host_checks_are_soft=0
	check_for_orphaned_services=1
	check_for_orphaned_hosts=1
	check_service_freshness=1
	service_freshness_check_interval=60
	check_host_freshness=0
	host_freshness_check_interval=60
	additional_freshness_latency=15
	enable_flap_detection=1
	low_service_flap_threshold=5.0
	high_service_flap_threshold=20.0
	low_host_flap_threshold=5.0
	high_host_flap_threshold=20.0
	date_format=us
	enable_embedded_perl=1
	use_embedded_perl_implicitly=1
	illegal_object_name_chars=`~!$%^&*|'"<>?,()=
	illegal_macro_output_chars=`~$&|'"<>
	use_regexp_matching=0
	use_true_regexp_matching=0
	admin_email=sr2690 at columbia.edu
	daemon_dumps_core=0
	use_large_installation_tweaks=1
	enable_environment_macros=1
	debug_level=-1
	debug_verbosity=2
	debug_file=/var/log/nagios/nagios.debug
	max_debug_file_size=1000000
	
	
	
	
	my nagiostats output
	
	
	
	
	
	
	
	[sr2690>nagiostats
	
	Nagios Stats 3.0.6
	Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
	Last Modified: 12-01-2008
	License: GPL
	
	CURRENT STATUS DATA
	------------------------------------------------------
	Status File:                            /var/log/nagios/status.dat
	Status File Age:                        0d 0h 0m 19s
	Status File Version:                    3.0.6
	
	Program Running Time:                   0d 2h 5m 28s
	Nagios PID:                             12139
	Used/High/Total Command Buffers:        0 / 0 / 8192
	
	Total Services:                         2783
	Services Checked:                       2783
	Services Scheduled:                     2782
	Services Actively Checked:              2783
	Services Passively Checked:             0
	Total Service State Change:             0.000 / 52.830 / 0.263 %
	Active Service Latency:                 1.304 / 12092.843 / 1469.130 sec
	Active Service Execution Time:          0.011 / 15.103 / 0.468 sec
	Active Service State Change:            0.000 / 52.830 / 0.263 %
	Active Services Last 1/5/15/60 min:     0 / 0 / 0 / 129
	Passive Service Latency:                0.000 / 0.000 / 0.000 sec
	Passive Service State Change:           0.000 / 0.000 / 0.000 %
	Passive Services Last 1/5/15/60 min:    0 / 0 / 0 / 0
	Services Ok/Warn/Unk/Crit:              2560 / 13 / 186 / 24
	Services Flapping:                      17
	Services In Downtime:                   0
	
	Total Hosts:                            3037
	Hosts Checked:                          3005
	Hosts Scheduled:                        3029
	Hosts Actively Checked:                 3037
	Host Passively Checked:                 0
	Total Host State Change:                0.000 / 53.620 / 0.227 %
	Active Host Latency:                    0.000 / 12080.792 / 3770.409 sec
	Active Host Execution Time:             0.000 / 104.093 / 2.500 sec
	Active Host State Change:               0.000 / 53.620 / 0.227 %
	Active Hosts Last 1/5/15/60 min:        0 / 0 / 0 / 256
	Passive Host Latency:                   0.000 / 0.000 / 0.000 sec
	Passive Host State Change:              0.000 / 0.000 / 0.000 %
	Passive Hosts Last 1/5/15/60 min:       0 / 0 / 0 / 0
	Hosts Up/Down/Unreach:                  2849 / 188 / 0
	Hosts Flapping:                         10
	Hosts In Downtime:                      0
	
	Active Host Checks Last 1/5/15 min:     0 / 0 / 1
	   Scheduled:                           0 / 0 / 0
	   On-demand:                           0 / 0 / 1
	   Parallel:                            0 / 0 / 0
	   Serial:                              0 / 0 / 0
	   Cached:                              0 / 0 / 1
	Passive Host Checks Last 1/5/15 min:    0 / 0 / 0
	Active Service Checks Last 1/5/15 min:  0 / 0 / 0
	   Scheduled:                           0 / 0 / 0
	   On-demand:                           0 / 0 / 0
	   Cached:                              0 / 0 / 0
	Passive Service Checks Last 1/5/15 min: 0 / 0 / 0
	
	External Commands Last 1/5/15 min:      0 / 0 / 0
	
	
	
	
	
	
	-- 
	Cordially,
	Shadhin Rahman

	
	------------------------------------------------------------------------------
	This SF.net email is sponsored by:
	High Quality Requirements in a Collaborative Environment.
	Download a free trial of Rational Requirements Composer Now!
	http://p.sf.net/sfu/www-ibm-com
	_______________________________________________
	Nagios-users mailing list
	Nagios-users at lists.sourceforge.net
	https://lists.sourceforge.net/lists/listinfo/nagios-users
	::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
	::: Messages without supporting info will risk being sent to /dev/null







-- 
Cordially,
Shadhin Rahman








