Using Nagios to monitor "service-less" hosts
Tedman Eng
teng at dataway.com
Thu Nov 9 00:30:51 CET 2006
Are you sure you haven't got check_interval configured in the host
definition, or inherited from the template being applied? The "active
checks" setting does something different.
"check_interval" = how often to perform SCHEDULED host checks
"active_checks_enabled" = whether or not Nagios executes a check when
needed
"When needed" can be triggered by the host's "check_interval" or by a
service going non-OK.
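As a rough sketch of the difference, a host definition along these lines
should turn off regularly scheduled host checks while still letting Nagios
run a host check when one is needed. The template name "generic-host" and
the check_ping arguments are assumptions for illustration and are not taken
from Andy's config; the remaining required host directives are assumed to
come from the template.

```
define host{
        use                     generic-host    ; hypothetical template; confirm it does not set check_interval itself
        host_name               FH-Gateway
        check_command           check_ping!100.0,20%!500.0,60%  ; assumed command name and thresholds
        check_interval          0       ; no regularly SCHEDULED host checks
        active_checks_enabled   1       ; Nagios may still execute a host check when needed
        }
```

Setting active_checks_enabled to 0 instead would, going by the definitions
above, stop Nagios from executing active host checks at all, including the
on-demand ones triggered by a non-OK service.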
With state retention turned on, some settings are retained, so changes to
them in the .cfg files are ignored.
For more detailed info, see
http://nagios.sourceforge.net/docs/2_0/xodtemplate.html#retention_notes
NOTE: The retention notes specify that it only applies to settings changed
during runtime, but I've seen cases where undefining a setting does not
"clear" the setting if it's been set in the past through a .cfg file.
For example, enabling "flap detection" in the .cfg and then later leaving it
undefined left the host with flap detection enabled.
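The retention notes linked above describe disabling retention of non-status
information as a way to make Nagios take these values from the config files
on restart. A minimal sketch of that, which I have not tested against this
setup: the host name is from this thread, the template name is hypothetical,
and stating flap_detection_enabled explicitly is only there to make the
intended value visible rather than leaving it undefined.

```
define host{
        use                             generic-host    ; hypothetical template
        host_name                       SC-Gateway
        flap_detection_enabled          0       ; stated explicitly instead of being left undefined
        retain_nonstatus_information    0       ; take non-status settings from the .cfg on (re)start
        }
```

With retain_nonstatus_information set to 0, any settings changed at runtime
for that host would also be lost across a restart, so this is a trade-off
rather than a free fix.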
> -----Original Message-----
> From: Andy Shellam (Mailing Lists)
> [mailto:andy.shellam-lists at mailnetwork.co.uk]
> Sent: Wednesday, November 08, 2006 2:45 PM
> To: Tedman Eng
> Cc: nagios-users at lists.sourceforge.net
> Subject: Re: [Nagios-users] Using Nagios to monitor
> "service-less" hosts
>
>
> Ted,
>
> I've stopped Nagios, removed all ".dat" files from var, and restarted
> it - all checks are now pending.
> However, I did look through retention.dat (I presume this is what you
> meant - status.sav didn't exist) before I killed it, and the
> check_interval parameter was not defined for any host.
>
> I would think, surely, "state retention" only retains the service/host
> check states so if, for example, the Nagios machine reboots, when it
> comes back up it knows where it left off. Otherwise if you change the
> config, you'd have to remember to remove all the .dat files (or at least
> retention.dat) in var before the config change takes effect, and I
> certainly haven't had to do that before.
>
> And as far as Nagios was concerned, "scheduled active host checks" were
> OFF - or so it said in the config viewer.
>
> I'll wait a couple of minutes, see where it goes from here......
>
> OK 5 minutes have passed - no different.
> Service SC-Gateway - Ping = checked and confirmed OK at 22:38:15
> The host SC-Gateway = checked and confirmed OK at 22:38:15, then checked
> again at 22:39:40 and again at 22:41:40.
> And note "Next active scheduled check" reads N/A.
>
> Andy.
>
> Tedman Eng wrote:
> > If you have state retention enabled, then Nagios remembers lots of
> > settings and does not "reset" them when reloading a config (otherwise
> > it wouldn't be retaining). "Host Active Checks Enabled" likely did not
> > disable themselves after changing the .cfg file, because the state was
> > "remembered" from previous runs. Try stopping Nagios, clearing the
> > status.sav and restarting Nagios.
> >
> >
> >> -----Original Message-----
> >> From: Andy Shellam (Mailing Lists)
> >> [mailto:andy.shellam-lists at mailnetwork.co.uk]
> >> Sent: Wednesday, November 08, 2006 12:58 PM
> >> To: Tedman Eng
> >> Cc: nagios-users at lists.sourceforge.net
> >> Subject: Re: [Nagios-users] Using Nagios to monitor
> >> "service-less" hosts
> >>
> >>
> >> Hi Ted,
> >>
> >> I understand the distinction - I *did* have host checks actively
> >> scheduled (ie. the host parameter 'check_interval' set to 1 - this is
> >> now 0 so host checks shouldn't be scheduled, right?) Yet Nagios IS
> >> checking the hosts every few minutes roughly, regardless of child
> >> service status.
> >>
> >> Here's a dead simple example - the FH-Gateway - it has a single service,
> >> which is a Ping. The host also has a Ping set as its
> >> active_check_command parameter.
> >> Now, if I show you the service breakdown for the Ping _service_ on
> >> FH-Gateway:
> >>
> >> Current Status:              OK
> >> Status Information:          PING OK - Packet loss = 0%, RTA = 3.02 ms
> >> Performance Data:
> >> Current Attempt:             1/2
> >> State Type:                  HARD
> >> Last Check Type:             ACTIVE
> >> Last Check Time:             08-11-2006 20:49:37
> >> Status Data Age:             0d 0h 0m 51s
> >> Next Scheduled Active Check: 08-11-2006 20:50:37
> >> Latency:                     0.607 seconds
> >> Check Duration:              9.013 seconds
> >> Last State Change:           08-11-2006 10:46:46
> >> Current State Duration:      0d 10h 3m 42s
> >>
> >>
> >> Nagios reports it's been in the same state (ie. OK) for 10 hours, 3
> >> minutes, and 42 seconds right?
> >> So why was the host checked only a few seconds ago?
> >>
> >> Host Status:                 UP
> >> Status Information:          PING OK - Packet loss = 0%, RTA = 0.27 ms
> >> Performance Data:
> >> Current Attempt:             1/2
> >> State Type:                  HARD
> >> Last Check Type:             ACTIVE
> >> Last Check Time:             08-11-2006 20:50:49
> >> Status Data Age:             0d 0h 0m 39s
> >> Next Scheduled Active Check: N/A
> >> Latency:                     9.113 seconds
> >> Check Duration:              9.011 seconds
> >> Last State Change:           07-11-2006 06:20:35
> >> Current State Duration:      1d 14h 30m 53s
> >> Last Host Notification:      N/A
> >> Current Notification Number: 0
> >> Is This Host Flapping?       NO
> >> Percent State Change:        0.00%
> >> In Scheduled Downtime?       NO
> >> Last Update:                 08-11-2006 20:51:16
> >>
> >>
> >> If the general line of thinking is correct, Nagios should have last
> >> checked the host back at (or around) 10:46 this morning when there was a
> >> blip in the service check. But it didn't. It does check them every 1-2
> >> minutes.
> >> My check_interval parameter is 0 - the config viewer in the web CGIs
> >> shows "enabled active checks" as NO for each host.
> >>
> >> Since I've been writing this - the above host has been checked again at
> >> 20:54:49 - exactly 4 minutes since the last check. No change in the
> >> service status - 10 hours, 9 minutes now.
> >>
> >> Any ideas?
> >>
> >> Andy.
> >>
> >>
> >>
> >> Tedman Eng wrote:
> >>
> >>> Host checks are not actively scheduled in normal operation.
> >>>
> >>> You could go months without requiring a host check, and the status age of
> >>> the host check will show something like 81 days for example.
> >>>
> >>> If you see recent host checks, then that means there was a service problem
> >>> and Nagios wanted to be sure it wasn't the host.
> >>>
> >>> Perhaps if you thought of "host check" as "network link status", it would
> >>> make the distinction more clear.
> >>>
> >>>> -----Original Message-----
> >>>> From: Andy Shellam (Mailing Lists)
> >>>> [mailto:andy.shellam-lists at mailnetwork.co.uk]
> >>>> Sent: Wednesday, November 08, 2006 11:56 AM
> >>>> To: Sloane, Robert Raymond
> >>>> Cc: nagios-users at lists.sourceforge.net
> >>>> Subject: Re: [Nagios-users] Using Nagios to monitor
> >>>> "service-less" hosts
> >>>>
> >>>>
> >>>> Sloane, Robert Raymond wrote:
> >>>>
> >>>>>> Last Check Time: 08-11-2006 19:34:40
> >>>>>> Next Scheduled Active Check: N/A
> >>>>>
> >>>>> Interesting. Nagios thinks the last check was run over a month ago.
> >>>>
> >>>> No, thankfully! That date is the 8th November (British format.)
> >>>>
> >>>>> You wouldn't see anything about hosts in the scheduling queue. Host
> >>>>> checks are run immediately, not through the queue. That is why it is
> >>>>> best to not use them.
> >>>>
> >>>> I did when the check_interval was set to 1 in the hosts - it showed the
> >>>> host name and a blank service column.
> >>>> I'd mentioned this only to prove the point that the checks do not seem
> >>>> to be scheduled any more, so I cannot figure out why it's still running
> >>>> the host checks at (seemingly) regular intervals.
> >>>>
> >>>> There are no hosts under that machine (or indeed above it), and all
> >>>> services checks are up and have been for a good 6-8 hours.
> >>>>
> >>>> I'm stumped!
> >>>>
> >>>> Andy.
> >>>>
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null