Problems with service scheduling
Marcus Fleige
mfleige at rhenus.de
Tue Aug 29 10:42:06 CEST 2006
Hi list,
I recently ran into problems with the service scheduling in my
Nagios installation, and I think I need some help. I'm running Nagios 2.5
with roughly 300 hosts and 2800 services on a 4-CPU Xeon machine with
2 GB RAM. Services are actively monitored.
Thing is: I changed the configuration last week to better fit our needs.
That included a lot of renaming of services, contacts, contact groups and
escalations. After I finished, I restarted the Nagios daemon yesterday
morning at about 9am. Result: the process doesn't start monitoring. I
looked into the scheduling queue, and it told me it will start the
monitoring at 5pm.
Over the day, I tried to analyse the problem. I reviewed the config,
even though Nagios verifies it as good, and found nothing. Restarting
the process has no effect; the scheduling queue doesn't change.
I tried the old config (praise svn!), and the process starts as
usual, generating a new scheduling queue and beginning the monitoring.
Since the only file influencing the scheduling queue is the main config
file, I copied it again from the old to the new config, although I had
not changed it. That had no effect, which at least suggests the error
lives in the host/service/escalation area of my config.
When I restarted the Nagios daemon at 4:50pm and waited until 5pm, it
began to monitor as expected. That worked until this morning, when
at around 9am (again!) the scheduling queue showed up with the 5pm start
time again.
I recompiled Nagios with DEBUG1-3 to get some more information. After
validating the config, it shows the following:
[...]
Completed service verification checks
Completed host verification checks
Completed hostgroup verification checks
Completed servicegroup verification checks
Completed contact verification checks
Completed contact group verification checks
Completed service escalation checks
Completed service dependency checks
Completed host escalation checks
Completed host dependency checks
Completed command checks
Completed command checks
Completed extended host info checks
Completed extended service info checks
Completed circular path checks
Completed circular host and service dependency checks
Completed global event handler command checks
Completed obsessive compulsive processor command checks
$0: Cannot enter daemon mode with DEBUG option(s) enabled. We'll run as
a foreground process instead...
COMMAND FILE THREAD: 1077427120
Preferred Time: 1156839911 --> Tue Aug 29 10:25:11 2006
Next Valid Time: 1156863600 --> Tue Aug 29 17:00:00 2006
Preferred Time: 1156839911 --> Tue Aug 29 10:25:11 2006
Next Valid Time: 1156863600 --> Tue Aug 29 17:00:00 2006
[...]
Host 'AP001' should not be scheduled
Host 'AP002' should not be scheduled
Host 'AP003' should not be scheduled
Host 'AP004' should not be scheduled
Host 'AP005' should not be scheduled
[...]
Total scheduled services: 2837
Service Interleave factor: 1
Total service interleave blocks: 2837
Service inter-check delay: 1.0
Current Interleave Block: 0
Service 'Network: Ping' on host 'AP001'
CIB: 0, IBI: 1, TIB: 2837, SIF: 1
Mult factor: 2837
Preferred Check Time: 1156842748 --> Tue Aug 29 11:12:28 2006
Preferred Time: 1156842748 --> Tue Aug 29 11:12:28 2006
Next Valid Time: 1156863600 --> Tue Aug 29 17:00:00 2006
Actual Check Time: 1156863600 --> Tue Aug 29 17:00:00 2006
[...]
As you can see, I also changed the service interleaving from smart to
dumb with an interleave factor of 1 to circumvent the scheduling logic.
In vain, I guess... :-(
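To make the gap in the debug output easier to see, the epoch values can be
decoded and compared directly. This is only a small sketch: the fixed UTC+2
offset for CEST and the helper name fmt are assumptions made for
illustration, and nothing here is part of Nagios itself.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the log timestamps are in CEST (UTC+2), as in the mail header.
CEST = timezone(timedelta(hours=2), "CEST")

def fmt(epoch):
    """Format an epoch value the same way the debug output prints it."""
    return datetime.fromtimestamp(epoch, CEST).strftime("%a %b %d %H:%M:%S %Y")

preferred = 1156839911   # "Preferred Time" from the debug output
next_valid = 1156863600  # "Next Valid Time" from the debug output

# How far the check is pushed back from the time the scheduler preferred.
delay = timedelta(seconds=next_valid - preferred)

print(fmt(preferred), "->", fmt(next_valid), "| pushed back by", delay)
```

Every check in the excerpt lands on exactly 17:00:00 regardless of its
preferred time, so the checks are being moved to a fixed boundary rather
than delayed by a constant amount.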
Now, for my questions:
Has anyone seen such behaviour already?
Where is that "Next Valid Time" in the Debug-Output from?
How is it generated?
Is there any tool besides the daemon itself to validate the config files?
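As an illustration of the second and third questions, here is one way a
scheduler could turn a preferred time into a "next valid time" by checking
it against a time period. This is a hedged sketch and not Nagios's actual
code: the function next_valid_time, the EVENINGS_ONLY table and its
17:00-24:00 window are all hypothetical, chosen only because they reproduce
the pattern in the debug output above. Whether a time period is really the
cause here is an assumption.

```python
from datetime import datetime, time, timedelta

# Hypothetical time period: weekday (0 = Monday) -> list of (start, end) ranges.
# An end of None means "until midnight". The 17:00 start is an assumption.
EVENINGS_ONLY = {day: [(time(17, 0), None)] for day in range(7)}

def next_valid_time(preferred, period, max_days=7):
    """Return the earliest datetime >= preferred that lies inside the period."""
    for offset in range(max_days + 1):
        day = (preferred + timedelta(days=offset)).date()
        ranges = sorted(period.get(day.weekday(), []), key=lambda r: r[0])
        for start, end in ranges:
            range_start = datetime.combine(day, start)
            if end is None:
                range_end = datetime.combine(day + timedelta(days=1), time(0, 0))
            else:
                range_end = datetime.combine(day, end)
            if preferred < range_start:
                return range_start   # wait for the range to open
            if preferred < range_end:
                return preferred     # already inside a valid range
    return None                      # no valid range within max_days

preferred = datetime(2006, 8, 29, 10, 25, 11)
print(preferred, "->", next_valid_time(preferred, EVENINGS_ONLY))
```

With such a table, a preferred time in the morning is moved to 17:00:00 on
the same day, while a preferred time after 17:00 is left unchanged. That
would match a restart at 4:50pm working and a restart at 9am not working.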
Thanks for reading all the way down here, and please excuse any language
errors.
Regards,
Marcus Fleige
--
EOF