<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7232.36">
<TITLE>RE: [Nagios-users] 2.0b5 initial host/service checks delayed after start (not present in 2.0b3)</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/plain format -->
<BR>
<P><FONT SIZE=2>Additionally, this release is not sending alerts at all, although the same config does send them under 2.0b3. The logs report the host down, but the alert and contact rules aren't being triggered. There are no errors to that effect, however: nothing in the nagios -v check, and nothing in the logs to indicate that it is even attempting to alert. I've verified that the alert commands' path and syntax are correct.<BR>
<BR>
Has anyone else seen this in the current release (2.0b5) or in 2.0b4?<BR>
<BR>
/eli<BR>
<BR>
-----Original Message-----<BR>
From: nagios-users-admin@lists.sourceforge.net on behalf of Eli Stair<BR>
Sent: Mon 11/28/2005 7:58 PM<BR>
To: nagios-users@lists.sourceforge.net<BR>
Subject: [Nagios-users] 2.0b5 initial host/service checks delayed after start (not present in 2.0b3)<BR>
<BR>
<BR>
I'm running a fresh build of 2.0b5 on x86_64. After an initial start of<BR>
nagios, it can take up to 10 minutes for the first host or service<BR>
checks to begin. There is no CPU load by the nagios process during this<BR>
time. I have over 1000 hosts to check, and have reduced the max<BR>
host/service check spread to rule out the scheduler simply "evening"<BR>
out the initial checks over that time.<BR>
<BR>
This problem is NOT occurring on a 2.0b3 build with the exact same<BR>
configuration.<BR>
<BR>
After the checks DO start, a full run can take hours to finish. I've<BR>
changed the nagios user to root so that the host check can be<BR>
check_icmp -t 1 -p 1.<BR>
<BR>
Unfortunately, even with this setup, having anywhere between 4 and<BR>
64 hosts go down can make the "monitoring" aspect effectively useless.<BR>
<BR>
Any suggestions on the problem of startup lag?<BR>
Any ways to further speed up the host check runs, aside from using<BR>
check_icmp?<BR>
<BR>
Thanks,<BR>
<BR>
/eli<BR>
<BR>
### inline nagios.cfg:<BR>
<BR>
<BR>
[root@monitor02 etc]# cat nagios.cfg | egrep -v "^#|^$"<BR>
log_file=/var/log/nagios/nagios.log<BR>
cfg_file=/usr/local/nagios/etc/checkcommands.cfg<BR>
cfg_file=/usr/local/nagios/etc/misccommands.cfg<BR>
cfg_dir=/usr/local/nagios/etc/config<BR>
cfg_file=/usr/local/nagios/etc/timeperiods.cfg<BR>
cfg_file=/usr/local/nagios/etc/contacts.cfg<BR>
cfg_file=/usr/local/nagios/etc/contactgroups.cfg<BR>
cfg_file=/usr/local/nagios/etc/hosts.cfg<BR>
cfg_file=/usr/local/nagios/etc/hostgroups.cfg<BR>
cfg_file=/usr/local/nagios/etc/customcommands.cfg<BR>
cfg_file=/usr/local/nagios/etc/services.cfg<BR>
object_cache_file=/usr/local/nagios/var/objects.cache<BR>
resource_file=/usr/local/nagios/etc/resource.cfg<BR>
status_file=/usr/local/nagios/var/status.dat<BR>
nagios_user=root<BR>
nagios_group=root<BR>
check_external_commands=1<BR>
command_check_interval=-1<BR>
command_file=/usr/local/nagios/var/rw/nagios.cmd<BR>
comment_file=/usr/local/nagios/var/comments.dat<BR>
downtime_file=/usr/local/nagios/var/downtime.dat<BR>
lock_file=/usr/local/nagios/var/nagios.lock<BR>
temp_file=/usr/local/nagios/var/nagios.tmp<BR>
event_broker_options=-1<BR>
log_rotation_method=d<BR>
log_archive_path=/var/log/nagios/archives<BR>
use_syslog=1<BR>
log_notifications=1<BR>
log_service_retries=1<BR>
log_host_retries=1<BR>
log_event_handlers=1<BR>
log_initial_states=0<BR>
log_external_commands=1<BR>
log_passive_checks=1<BR>
service_inter_check_delay_method=s<BR>
max_service_check_spread=15<BR>
service_interleave_factor=s<BR>
host_inter_check_delay_method=s<BR>
max_host_check_spread=10<BR>
max_concurrent_checks=0<BR>
service_reaper_frequency=15<BR>
auto_reschedule_checks=0<BR>
auto_rescheduling_interval=30<BR>
auto_rescheduling_window=180<BR>
sleep_time=0.25<BR>
service_check_timeout=60<BR>
host_check_timeout=30<BR>
event_handler_timeout=30<BR>
notification_timeout=30<BR>
ocsp_timeout=5<BR>
perfdata_timeout=5<BR>
retain_state_information=1<BR>
state_retention_file=/usr/local/nagios/var/retention.dat<BR>
retention_update_interval=0<BR>
use_retained_program_state=1<BR>
use_retained_scheduling_info=0<BR>
interval_length=60<BR>
use_aggressive_host_checking=0<BR>
execute_service_checks=1<BR>
accept_passive_service_checks=0<BR>
execute_host_checks=1<BR>
accept_passive_host_checks=1<BR>
enable_notifications=1<BR>
enable_event_handlers=1<BR>
process_performance_data=0<BR>
obsess_over_services=0<BR>
check_for_orphaned_services=0<BR>
check_service_freshness=1<BR>
service_freshness_check_interval=60<BR>
check_host_freshness=1<BR>
host_freshness_check_interval=60<BR>
aggregate_status_updates=1<BR>
status_update_interval=15<BR>
enable_flap_detection=0<BR>
low_service_flap_threshold=5.0<BR>
high_service_flap_threshold=20.0<BR>
low_host_flap_threshold=5.0<BR>
high_host_flap_threshold=20.0<BR>
date_format=iso8601<BR>
illegal_object_name_chars=`~!$%^&amp;*|'"&lt;&gt;?,()=<BR>
illegal_macro_output_chars=`~$&amp;|'"&lt;&gt;<BR>
use_regexp_matching=0<BR>
use_true_regexp_matching=0<BR>
admin_email=nagios<BR>
admin_pager=pagenagios<BR>
daemon_dumps_core=0<BR>
<BR>
<BR>
<BR>
<BR>
<BR>
</FONT>
</P>
</BODY>
</HTML>