newbie question...
Andrew Davis
nccomp at gmail.com
Mon Jun 1 21:00:11 CEST 2009
Marc Powell wrote:
> On Jun 1, 2009, at 3:09 AM, Arnar Þórarinsson wrote:
>
>
>> So there is no way of getting just the host down alert when a host
>> goes down?
>>
>> To explain a little, let's say I'm monitoring CPU, memory and disk
>> space on a host.
>> The host goes down and Nagios sends an alert by email for the host
>> down event and
>> also for the CPU, memory and disk space events. All I need to know
>> about this event is that the host is down.
>> I just think that it's not necessary to send an alert email about
>> services on a host that is down.
>>
>
> And so does Nagios. As I said earlier, Nagios does this automatically.
> To restate: when a host is down, Nagios suppresses all e-mail
> notifications about that host's services, but will still display them
> as down in the GUI. It will only send the host down notification.
>
> The first section of http://nagios.sourceforge.net/docs/2_0/networkreachability.html
> states it best. It still applies to 3.x but I haven't found the
> section that states it as clearly.
>
> --
> Marc
>
If I'm interpreting your question correctly, you're saying that when one
of your servers actually goes down, you ARE getting alerts
(email/SMS/whatever) for more than just the host being down? I see
what Marc's saying: he's telling you this shouldn't happen. Nagios was
built to check first that the host is up and reachable and, if it's not,
to notify you that the host is down, but not to ALERT you about all the
host-dependent tests that are now failing. Nagios will still run all the
tests and fail on them, and the web interface will reflect more than just
the HOST DOWN, but the only email/SMS you get should be for the HOST DOWN.
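
To make that rule concrete, here is a toy model of it in Python. This is
only a sketch of the behaviour described above, not Nagios code: the Host
class, the notifications_for function and the message strings are all
made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Host:
    # Hypothetical model: "up" is the result of the host check,
    # "services" maps a service name to True (OK) or False (failing).
    name: str
    up: bool
    services: Dict[str, bool] = field(default_factory=dict)


def notifications_for(host: Host) -> List[str]:
    """Return the alerts this toy model would send for one host."""
    if not host.up:
        # Host check failed: only the host alert goes out. The failing
        # services would still show in the web interface, but they are
        # not notified.
        return [f"HOST DOWN: {host.name}"]
    # Host is up: every failing service gets its own alert.
    return [
        f"SERVICE PROBLEM: {host.name}/{svc}"
        for svc, ok in host.services.items()
        if not ok
    ]


web = Host("web1", up=False, services={"CPU": False, "Memory": False, "Disk": False})
print(notifications_for(web))  # ['HOST DOWN: web1']
```

If the same host had up=True with those three services failing, the model
would return three service alerts, which is the situation described in
the original question.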

However, you may need to clarify what you mean by *down*. *Down* does
not always mean off or 100% non-responsive. On *nix systems I've quite
often seen a server hang, fail, or segfault but still be reachable over
the network. The reason is that parts of the OS stay in memory, so
things like pings from remote hosts still get answered even though the
host itself is no longer functional (ISPs get this a lot: the host
pings, but you can't ssh in, for example). If Nagios can ping the host,
it will then run the other tests and alert on them.

Here's a quick way to narrow this down: turn off the server (shut
down and pull power). The Nagios web interface should show the host down
and all tests as failing, but the only email/SMS you should get is the
host down. If you still get emailed/alerted for the services, then you
might have a configuration error. Perhaps you didn't properly define
your host checks as opposed to service checks? Do you have a check_ping
or check_icmp host check for each host?
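
For the "pings but you can't ssh in" case, a quick probe from another
machine can help tell a hung host from a healthy one. The sketch below is
not a Nagios plugin; the function name ssh_banner_seen is made up, and it
assumes the SSH server sends its "SSH-" identification string as the
first thing on the connection, which is the usual behaviour but not
guaranteed. It reads the greeting rather than just connecting because a
bare TCP connect can succeed even when the daemon behind the port is
hung.

```python
import socket


def ssh_banner_seen(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if host:port answers with an SSH identification string."""
    try:
        # The timeout applies to both the connect and the recv below.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = sock.recv(256)
    except OSError:
        # Refused, unreachable, DNS failure, or timed out waiting for data.
        return False
    return data.startswith(b"SSH-")


# Example call (the host name is a placeholder):
# ssh_banner_seen("server.example.com")
```

If ping succeeds but this returns False, the host is probably in the
half-dead state described above, and Nagios would treat it as up and
alert on the individual services.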
AD