Attempting to monitor the "Nagios Server" itself
Marc Powell
marc at ena.com
Wed Aug 13 00:33:57 CEST 2008
Please always respond on list so that others, now and in the future,
obtain the benefit of your experience. More below...
On Aug 12, 2008, at 4:36 PM, Bret Goodfellow wrote:
> Hi Marc,
>
> Thanks for the quick response! I added the following host definition:
>
> ######################################################################################
> # 'colorado' host definition #
> ######################################################################################
> define host{
> use generic-host ; Name of host template to use
>
> host_name colorado
> alias colorado
> address 10.8.64.201
> check_command check-host-alive
> contact_groups linux-admins,linux-admins-page,oracle-admins
> max_check_attempts 10
> max_check_attempts 10
> notification_interval 480
> notification_period 24x7
> notification_options d,u,r
> }
>
> After adding this definition, I noticed on the nagios monitor that I
> have "1" pending host. This "pending status" never changes. The
> pending host of course, is colorado.
Pending is normal for a host with no services. It'll never be checked.
That's expected.
> Adding the host definition for "colorado" only, does not cause
> nagios to fail.
Good to know.
> The failure occurs when I add the attached "services" config file.
> If I remove the "colorado" services config file, then nagios starts
> up and runs fine. My belief though, is that there is something
> wrong with the "host" definition file.
Why do you believe that? It looks OK to me. I would think the problem
is somehow associated with the newly included services file or
something external to nagios.
> Since the server I want to monitor is the "localhost", do I need to
> replace the alias with the name "localhost".
No, the alias doesn't matter, it's just for humans to know what the
machine is. I don't see anything obviously wrong with the file*. I
would try adding it in chunks of a few definitions at a time and see
which one causes nagios to segfault. Jon Agliss's strace suggestion is
good as well. I'd use
'strace -fs512 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg'
myself.
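
As a rough sketch of the "add it in chunks" approach, a small shell helper
can copy only the first N object definitions from the services file into a
partial file. The function name and the file names in the usage note are
hypothetical, and the helper assumes each definition starts with a
"define <type>{" line, as in the host definition quoted above.

```shell
#!/bin/sh
# Print everything up to (but not including) the (N+1)th "define ...{" line,
# i.e. the first N object definitions of a Nagios config file.
# $1 = config file, $2 = number of definitions to keep
first_n_defs() {
    awk -v max="$2" '
        # Count each line that opens an object definition.
        /^[ \t]*define[ \t]+[A-Za-z_]+[ \t]*[{]/ { count++ }
        # Stop as soon as one definition too many begins.
        count > max { exit }
        { print }
    ' "$1"
}
```

For example, 'first_n_defs colorado-services.cfg 5 > colorado-partial.cfg'
would keep the first five definitions. Point nagios.cfg at the partial file,
run 'nagios -v' against nagios.cfg to verify it, start nagios, and raise the
count until the segfault comes back.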
* Some of the tests you're doing, just based on their names, could be
implemented better. For example, why are you ssh'ing to the same
machine, essentially a localhost IP (check_ssh_disk), or using SNMP
(check_snmp_storage) to check disks, when it seems that just using
check_disk would do the same thing without the hoops and points of
failure?
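
To illustrate, here is a minimal sketch of running check_disk by hand, the
same way nagios would invoke it locally. The plugin directory and the
20%/10% thresholds are assumptions, not values from your config; adjust
them to your install.

```shell
#!/bin/sh
# Assumed plugin location (matches a /usr/local/nagios install prefix).
PLUGIN_DIR="${PLUGIN_DIR:-/usr/local/nagios/libexec}"

# $1 = warning threshold, $2 = critical threshold, $3 = mount point
check_local_disk() {
    if [ ! -x "$PLUGIN_DIR/check_disk" ]; then
        # 3 is the plugin exit code for UNKNOWN.
        echo "UNKNOWN: check_disk not found in $PLUGIN_DIR"
        return 3
    fi
    # -w/-c are free-space thresholds, -p selects the path to check.
    "$PLUGIN_DIR/check_disk" -w "$1" -c "$2" -p "$3"
}

# Warn below 20% free, go critical below 10% free, on the root filesystem.
check_local_disk 20% 10% / || echo "exit status: $?"
```

If that prints a sensible result on the command line, the same command
line can go into a command definition with no ssh or snmp in between.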
--
Marc