serious performance issue
shadih rahman
shadhin71 at gmail.com
Thu Apr 9 15:55:08 CEST 2009
Now my Nagios is not running any checks at all. I get a lot of "looks like
it was orphaned" messages and then Nagios just sits there. Can someone help
me with this? I have included some entries from nagios.debug and nagios.log
below, along with nagiostats output and my nagios.cfg. Thanks in advance.
nagios.debug:
[1239284464.560241] [016.2] [pid=15690] Found another host check event for this host @ Thu Apr 9 08:59:56 2009
[1239284464.560248] [016.2] [pid=15690] New host check event occurs after the existing event, so we'll ignore it.
[1239284464.560253] [016.2] [pid=15690] Keeping original host check event (ignoring the new one).
[1239284464.560261] [016.1] [pid=15690] ** Async check result for host 'iab323pc20.atg.columbia.edu' handled: new state=0
nagios.log:
[1239254607] Warning: The check of host 'et251pc70.atg.columbia.edu' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
[1239254607] Warning: The check of host 'et251pc71.atg.columbia.edu' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
[1239254607] Warning: The check of host 'et251pc72.atg.columbia.edu' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
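For anyone who wants to see how many hosts are affected, here is a rough Python sketch that counts the orphan warnings per host. It only handles the host message format shown in the excerpt above, and the names `ORPHAN_RE` and `count_orphans` are made up for illustration. It flattens whitespace first, so it also copes with lines that were wrapped by a mail client.

```python
import re
from collections import Counter

# Matches only the host orphan warning quoted above (assumption: the
# service variant of the message is worded differently and is ignored).
ORPHAN_RE = re.compile(
    r"\[(\d+)\] Warning: The check of host '([^']+)' "
    r"looks like it was orphaned"
)


def count_orphans(log_text):
    """Return a Counter mapping host name -> number of orphan warnings."""
    # Collapse all runs of whitespace (including newlines) to single
    # spaces so that hard-wrapped log lines still match.
    flat = " ".join(log_text.split())
    return Counter(m.group(2) for m in ORPHAN_RE.finditer(flat))


SAMPLE_LOG = """\
[1239254607] Warning: The check of host 'et251pc70.atg.columbia.edu' looks
like
it was orphaned (results never came back). I'm scheduling an immediate
check of
the host...
[1239254607] Warning: The check of host 'et251pc71.atg.columbia.edu' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
[1239254607] Warning: The check of host 'et251pc72.atg.columbia.edu' looks like it was orphaned (results never came back). I'm scheduling an immediate check of the host...
"""

orphans = count_orphans(SAMPLE_LOG)
for host, hits in orphans.most_common():
    print(f"{hits:4d}  {host}")
```

To run it against a real log, read the file into a string (for example with `open("/var/log/nagios/nagios.log").read()`) and pass that to `count_orphans`.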
nagiostats:
Nagios Stats 3.0.6
Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
Last Modified: 12-01-2008
License: GPL
CURRENT STATUS DATA
------------------------------------------------------
Status File: /var/log/nagios/status.dat
Status File Age: 0d 0h 0m 4s
Status File Version: 3.0.6
Program Running Time: 0d 15h 37m 5s
Nagios PID: 15690
Used/High/Total Command Buffers: 0 / 1 / 4096
Total Services: 2783
Services Checked: 2783
Services Scheduled: 2782
Services Actively Checked: 2783
Services Passively Checked: 0
Total Service State Change: 0.000 / 38.820 / 0.328 %
Active Service Latency: 244.062 / 37353.761 / 22185.948 sec
Active Service Execution Time: 0.010 / 15.072 / 0.293 sec
Active Service State Change: 0.000 / 38.820 / 0.328 %
Active Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
Passive Service Latency: 0.000 / 0.000 / 0.000 sec
Passive Service State Change: 0.000 / 0.000 / 0.000 %
Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
Services Ok/Warn/Unk/Crit: 2571 / 14 / 143 / 55
Services Flapping: 19
Services In Downtime: 0
Total Hosts: 3037
Hosts Checked: 3005
Hosts Scheduled: 3030
Hosts Actively Checked: 3037
Host Passively Checked: 0
Total Host State Change: 0.000 / 57.170 / 0.448 %
Active Host Latency: 0.000 / 36712.008 / 19785.947 sec
Active Host Execution Time: 0.000 / 30.011 / 1.589 sec
Active Host State Change: 0.000 / 57.170 / 0.448 %
Active Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 299
Passive Host Latency: 0.000 / 0.000 / 0.000 sec
Passive Host State Change: 0.000 / 0.000 / 0.000 %
Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0
Hosts Up/Down/Unreach: 2854 / 183 / 0
Hosts Flapping: 16
Hosts In Downtime: 0
Active Host Checks Last 1/5/15 min: 0 / 0 / 0
Scheduled: 0 / 0 / 0
On-demand: 0 / 0 / 0
Parallel: 0 / 0 / 0
Serial: 0 / 0 / 0
Cached: 0 / 0 / 0
Passive Host Checks Last 1/5/15 min: 0 / 0 / 0
Active Service Checks Last 1/5/15 min: 0 / 0 / 0
Scheduled: 0 / 0 / 0
On-demand: 0 / 0 / 0
Cached: 0 / 0 / 0
Passive Service Checks Last 1/5/15 min: 0 / 0 / 0
External Commands Last 1/5/15 min: 0 / 0 / 0
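To put the latency figures above in perspective, this is a small Python sketch that pulls the "min / max / avg sec" lines out of nagiostats output and converts the averages to hours. The regular expression is based only on the line layout shown in this post, and `TIMING_RE` and `parse_timings` are hypothetical names, not part of Nagios.

```python
import re

# One nagiostats timing line looks like:
#   Active Service Latency: 244.062 / 37353.761 / 22185.948 sec
# (assumption: the layout is as pasted in this post).
TIMING_RE = re.compile(
    r"^(?P<name>[A-Za-z ]+?):\s+"
    r"(?P<min>[\d.]+)\s*/\s*(?P<max>[\d.]+)\s*/\s*(?P<avg>[\d.]+) sec[ \t]*$",
    re.MULTILINE,
)


def parse_timings(stats_text):
    """Return {metric name: (min, max, avg)} with values in seconds."""
    timings = {}
    for m in TIMING_RE.finditer(stats_text):
        timings[m.group("name").strip()] = (
            float(m.group("min")),
            float(m.group("max")),
            float(m.group("avg")),
        )
    return timings


SAMPLE_STATS = """\
Active Service Latency: 244.062 / 37353.761 / 22185.948 sec
Active Service Execution Time: 0.010 / 15.072 / 0.293 sec
Active Host Latency: 0.000 / 36712.008 / 19785.947 sec
Active Host Execution Time: 0.000 / 30.011 / 1.589 sec
"""

timings = parse_timings(SAMPLE_STATS)
for name, (lowest, highest, average) in timings.items():
    # Latency is how far behind schedule a check started, so an average
    # measured in hours means the scheduler is badly backed up.
    print(f"{name}: avg {average:.1f} s = {average / 3600:.2f} h")
```

The point of the conversion is that execution times are well under two seconds on average while the latencies run to several hours, which suggests the checks themselves are not what is slow.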
nagios.cfg:
log_file=/var/log/nagios/nagios.log
cfg_file=/etc/nagios/commands.cfg
cfg_file=/etc/nagios/contacts.cfg
cfg_file=/etc/nagios/timeperiods.cfg
cfg_file=/etc/nagios/templates.cfg
cfg_dir=/etc/nagios/hosts
cfg_dir=/etc/nagios/services
object_cache_file=/var/log/nagios/objects.cache
precached_object_file=/var/log/nagios/objects.precache
resource_file=/etc/nagios/resource.cfg
status_file=/var/log/nagios/status.dat
status_update_interval=60
nagios_user=nagios
nagios_group=nagios
check_external_commands=1
command_check_interval=-1
command_file=/var/log/nagios/rw/nagios.cmd
external_command_buffer_slots=4096
lock_file=/var/log/nagios/nagios.lock
temp_file=/var/log/nagios/nagios.tmp
temp_path=/tmp
event_broker_options=8
broker_module=/usr/lib64/nagios/ndomod.o config_file=/etc/nagios/ndomod.cfg
log_rotation_method=m
log_archive_path=/var/log/nagios/archives
use_syslog=1
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=1
service_inter_check_delay_method=s
max_service_check_spread=30
service_interleave_factor=s
host_inter_check_delay_method=s
max_host_check_spread=30
max_concurrent_checks=0
check_result_reaper_frequency=10
max_check_result_reaper_time=20
check_result_path=/var/log/nagios/spool/checkresults
max_check_result_file_age=3600
cached_host_check_horizon=15
cached_service_check_horizon=15
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
soft_state_dependencies=0
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.25
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=60
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
state_retention_file=var/log/nagios/retention.dat
retention_update_interval=60
use_retained_program_state=1
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
interval_length=60
use_aggressive_host_checking=0
execute_service_checks=1
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=1
enable_event_handlers=1
process_performance_data=0
translate_passive_host_checks=0
passive_host_checks_are_soft=0
check_for_orphaned_services=1
check_for_orphaned_hosts=1
check_service_freshness=1
service_freshness_check_interval=60
check_host_freshness=0
host_freshness_check_interval=60
additional_freshness_latency=15
enable_flap_detection=1
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
date_format=us
enable_embedded_perl=0
use_embedded_perl_implicitly=0
illegal_object_name_chars=`~!$%^&*|'"<>?,()=
illegal_macro_output_chars=`~$&|'"<>
use_regexp_matching=0
use_true_regexp_matching=0
admin_email=sr2690 at columbia.edu
daemon_dumps_core=0
use_large_installation_tweaks=1
enable_environment_macros=1
debug_level=-1
debug_verbosity=2
debug_file=/var/log/nagios/nagios.debug
max_debug_file_size=1000000
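Since this config differs in several places from the one in my earlier message quoted below, a quick way to list the changed settings is to parse both files as key=value pairs and compare them. The following Python sketch assumes plain `key=value` lines with `#` comments; `parse_cfg` and `diff_cfg` are illustrative names only. Keys such as cfg_file can repeat, so each key maps to a list of values.

```python
def parse_cfg(text):
    """Parse nagios.cfg-style text into {key: [values...]}."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, since some values contain '='
        # themselves (e.g. illegal_object_name_chars).
        key, _, value = line.partition("=")
        settings.setdefault(key.strip(), []).append(value.strip())
    return settings


def diff_cfg(old, new):
    """Return {key: (old values or None, new values or None)} for changes."""
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = (before, after)
    return changes


# A handful of settings taken from the two configs in this thread.
OLD_CFG = """\
external_command_buffer_slots=8192
service_inter_check_delay_method=n
check_result_reaper_frequency=2
max_check_result_reaper_time=10
sleep_time=0.25
service_check_timeout=30
host_check_timeout=20
obsess_over_services=1
enable_embedded_perl=1
"""

NEW_CFG = """\
external_command_buffer_slots=4096
service_inter_check_delay_method=s
check_result_reaper_frequency=10
max_check_result_reaper_time=20
sleep_time=0.25
service_check_timeout=60
host_check_timeout=30
enable_embedded_perl=0
"""

changes = diff_cfg(parse_cfg(OLD_CFG), parse_cfg(NEW_CFG))
for key, (before, after) in changes.items():
    print(f"{key}: {before} -> {after}")
```

A value of `None` on either side means the setting is missing from that file, as with obsess_over_services in the newer config.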
On Wed, Apr 8, 2009 at 1:56 AM, fancyrabbit <fancyrabbit at gmail.com> wrote:
> I ran into almost the same issue.
> After setting enable_embedded_perl=0, the load average went up but
> latencies became lower.
>
> On Wed, Apr 8, 2009 at 11:54 AM, shadih rahman <shadhin71 at gmail.com> wrote:
>
>> I am seeing a ton of orphaned-check error messages for both services and
>> hosts. I am running Nagios on a quad-core 2.2 GHz machine with 4 GHZ memory.
>> I will paste my configuration file below. The machine sends NDO data to a
>> local database sitting on a 170 GB hard drive. Nagios is obsessing over both
>> hosts and services and sending the results to a machine with an identical
>> configuration. I am doing failover using NSCA. Please advise on this.
>>
>> nagios.cfg
>>
>> log_file=/var/log/nagios/nagios.log
>> cfg_file=/etc/nagios/commands.cfg
>> cfg_file=/etc/nagios/contacts.cfg
>> cfg_file=/etc/nagios/timeperiods.cfg
>> cfg_file=/etc/nagios/templates.cfg
>> cfg_dir=/etc/nagios/hosts
>> cfg_dir=/etc/nagios/services
>> object_cache_file=/var/log/nagios/objects.cache
>> precached_object_file=/var/log/nagios/objects.precache
>> resource_file=/etc/nagios/resource.cfg
>> status_file=/var/log/nagios/status.dat
>> status_update_interval=60
>> nagios_user=nagios
>> nagios_group=nagios
>> check_external_commands=1
>> command_check_interval=-1
>> command_file=/var/log/nagios/rw/nagios.cmd
>> external_command_buffer_slots=8192
>> lock_file=/var/log/nagios/nagios.lock
>> temp_file=/var/log/nagios/nagios.tmp
>> temp_path=/tmp
>> event_broker_options=8
>> broker_module=/usr/lib64/nagios/ndomod.o config_file=/etc/nagios/ndomod.cfg
>> log_rotation_method=m
>> log_archive_path=/var/log/nagios/archives
>> use_syslog=1
>> log_notifications=1
>> log_service_retries=1
>> log_host_retries=1
>> log_event_handlers=1
>> log_initial_states=0
>> log_external_commands=1
>> log_passive_checks=1
>> service_inter_check_delay_method=n
>> max_service_check_spread=30
>> service_interleave_factor=s
>> host_inter_check_delay_method=s
>> max_host_check_spread=30
>> max_concurrent_checks=0
>> check_result_reaper_frequency=2
>> max_check_result_reaper_time=10
>> check_result_path=/var/log/nagios/spool/checkresults
>> max_check_result_file_age=3600
>> cached_host_check_horizon=15
>> cached_service_check_horizon=15
>> enable_predictive_host_dependency_checks=1
>> enable_predictive_service_dependency_checks=1
>> soft_state_dependencies=1
>> auto_reschedule_checks=1
>> auto_rescheduling_interval=30
>> auto_rescheduling_window=180
>> sleep_time=0.25
>> service_check_timeout=30
>> host_check_timeout=20
>>
>> event_handler_timeout=30
>> notification_timeout=60
>> ocsp_timeout=5
>> perfdata_timeout=5
>> retain_state_information=1
>> state_retention_file=var/log/nagios/retention.dat
>> retention_update_interval=60
>> use_retained_program_state=1
>> use_retained_scheduling_info=1
>> retained_host_attribute_mask=0
>> retained_service_attribute_mask=0
>> retained_process_host_attribute_mask=0
>> retained_process_service_attribute_mask=0
>> retained_contact_host_attribute_mask=0
>> retained_contact_service_attribute_mask=0
>> interval_length=60
>> use_aggressive_host_checking=0
>> execute_service_checks=1
>> accept_passive_service_checks=1
>> execute_host_checks=1
>> accept_passive_host_checks=1
>> enable_notifications=1
>> enable_event_handlers=1
>> process_performance_data=0
>> obsess_over_services=1
>> ocsp_command=send_service_check
>> ochp_command=send_host_check
>> obsess_over_hosts=1
>> translate_passive_host_checks=0
>> passive_host_checks_are_soft=0
>> check_for_orphaned_services=1
>> check_for_orphaned_hosts=1
>> check_service_freshness=1
>> service_freshness_check_interval=60
>> check_host_freshness=0
>> host_freshness_check_interval=60
>> additional_freshness_latency=15
>> enable_flap_detection=1
>> low_service_flap_threshold=5.0
>> high_service_flap_threshold=20.0
>> low_host_flap_threshold=5.0
>> high_host_flap_threshold=20.0
>> date_format=us
>> enable_embedded_perl=1
>> use_embedded_perl_implicitly=1
>> illegal_object_name_chars=`~!$%^&*|'"<>?,()=
>> illegal_macro_output_chars=`~$&|'"<>
>> use_regexp_matching=0
>> use_true_regexp_matching=0
>> admin_email=sr2690 at columbia.edu
>> daemon_dumps_core=0
>> use_large_installation_tweaks=1
>> enable_environment_macros=1
>> debug_level=-1
>> debug_verbosity=2
>> debug_file=/var/log/nagios/nagios.debug
>> max_debug_file_size=1000000
>>
>> my nagiostats output
>>
>> [sr2690>nagiostats
>>
>> Nagios Stats 3.0.6
>> Copyright (c) 2003-2008 Ethan Galstad (www.nagios.org)
>> Last Modified: 12-01-2008
>> License: GPL
>>
>> CURRENT STATUS DATA
>> ------------------------------------------------------
>> Status File: /var/log/nagios/status.dat
>> Status File Age: 0d 0h 0m 19s
>> Status File Version: 3.0.6
>>
>> Program Running Time: 0d 2h 5m 28s
>> Nagios PID: 12139
>> Used/High/Total Command Buffers: 0 / 0 / 8192
>>
>> Total Services: 2783
>> Services Checked: 2783
>> Services Scheduled: 2782
>> Services Actively Checked: 2783
>> Services Passively Checked: 0
>> Total Service State Change: 0.000 / 52.830 / 0.263 %
>> Active Service Latency: 1.304 / 12092.843 / 1469.130 sec
>> Active Service Execution Time: 0.011 / 15.103 / 0.468 sec
>> Active Service State Change: 0.000 / 52.830 / 0.263 %
>> Active Services Last 1/5/15/60 min: 0 / 0 / 0 / 129
>> Passive Service Latency: 0.000 / 0.000 / 0.000 sec
>> Passive Service State Change: 0.000 / 0.000 / 0.000 %
>> Passive Services Last 1/5/15/60 min: 0 / 0 / 0 / 0
>> Services Ok/Warn/Unk/Crit: 2560 / 13 / 186 / 24
>> Services Flapping: 17
>> Services In Downtime: 0
>>
>> Total Hosts: 3037
>> Hosts Checked: 3005
>> Hosts Scheduled: 3029
>> Hosts Actively Checked: 3037
>> Host Passively Checked: 0
>> Total Host State Change: 0.000 / 53.620 / 0.227 %
>> Active Host Latency: 0.000 / 12080.792 / 3770.409 sec
>> Active Host Execution Time: 0.000 / 104.093 / 2.500 sec
>> Active Host State Change: 0.000 / 53.620 / 0.227 %
>> Active Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 256
>> Passive Host Latency: 0.000 / 0.000 / 0.000 sec
>> Passive Host State Change: 0.000 / 0.000 / 0.000 %
>> Passive Hosts Last 1/5/15/60 min: 0 / 0 / 0 / 0
>> Hosts Up/Down/Unreach: 2849 / 188 / 0
>> Hosts Flapping: 10
>> Hosts In Downtime: 0
>>
>> Active Host Checks Last 1/5/15 min: 0 / 0 / 1
>> Scheduled: 0 / 0 / 0
>> On-demand: 0 / 0 / 1
>> Parallel: 0 / 0 / 0
>> Serial: 0 / 0 / 0
>> Cached: 0 / 0 / 1
>> Passive Host Checks Last 1/5/15 min: 0 / 0 / 0
>> Active Service Checks Last 1/5/15 min: 0 / 0 / 0
>> Scheduled: 0 / 0 / 0
>> On-demand: 0 / 0 / 0
>> Cached: 0 / 0 / 0
>> Passive Service Checks Last 1/5/15 min: 0 / 0 / 0
>>
>> External Commands Last 1/5/15 min: 0 / 0 / 0
>>
>> --
>> Cordially,
>> Shadhin Rahman
>>
>>
>> ------------------------------------------------------------------------------
>> This SF.net email is sponsored by:
>> High Quality Requirements in a Collaborative Environment.
>> Download a free trial of Rational Requirements Composer Now!
>> http://p.sf.net/sfu/www-ibm-com
>> _______________________________________________
>> Nagios-users mailing list
>> Nagios-users at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nagios-users
>> ::: Please include Nagios version, plugin version (-v) and OS when
>> reporting any issue.
>> ::: Messages without supporting info will risk being sent to /dev/null
>>
--
Cordially,
Shadhin Rahman