How to troubleshoot when not receiving alerts?

John Oliver joliver at john-oliver.net
Thu Jul 24 23:59:10 CEST 2008


On Thu, Jul 24, 2008 at 01:53:19PM -0500, Marc Powell wrote:
> 
> On Jul 24, 2008, at 12:59 PM, John Oliver wrote:
> 
> > I have one alert set up that should be emailing every time it runs...
> > it's a disk space check on a server that has 1% left.  However, I am
> > not receiving any emails.  How to go about figuring out why?
> 
> Generally --
> 
> See if you have notification alerts in nagios.log for it.
> 	if no, verify the notification options for the service definition,
> contactgroup and contacts.

No, nothing is getting logged.  But then, there are very few logs
compared to the number of hosts / services it's monitoring... it looks
like only emails are being logged.  I looked in nagios.cfg for a logging
level type of option, but no dice.
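
A rough sketch of how the log could be searched for notification entries,
assuming the usual "[epoch] SERVICE NOTIFICATION: contact;host;service;state;..."
line layout.  The function name is hypothetical, and the location of nagios.log
is whatever log_file points to in nagios.cfg:

```python
import re

# Assumed layout of a service notification line in nagios.log:
# [epoch] SERVICE NOTIFICATION: contact;host;service;state;command;output
NOTIFY_RE = re.compile(
    r"^\[(\d+)\] SERVICE NOTIFICATION: ([^;]*);([^;]*);([^;]*);([^;]*);"
)


def find_notifications(lines, host, service):
    """Return (epoch, contact, state) for each notification logged
    for the given host and service description."""
    hits = []
    for line in lines:
        m = NOTIFY_RE.match(line)
        if m and m.group(3) == host and m.group(4) == service:
            hits.append((int(m.group(1)), m.group(2), m.group(5)))
    return hits
```

Passing it an open file handle for nagios.log with host "ftp" and service
"Disk Space" would list any notifications Nagios believes it sent; an empty
result suggests none were generated at all, rather than lost in the mail system.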

It was working yesterday.  I was getting emails from this plugin every
24 minutes (notification_interval was 1440).  They were all errors.  I
thought I had the errors fixed... the last email I got said RECOVERED
(even though I should be getting CRITICAL alerts, as there is 1% disk
space left).  I changed the notification_interval, and never saw another
email.

This AM, I set notification_interval to 60.  I should get an email every
minute.  I'm not.  And, yes, I'm restarting nagios ;-)
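
For reference, Nagios counts notification_interval in time units of
interval_length seconds (60 by default in nagios.cfg), so the real gap depends
on that setting.  A minimal sketch of the arithmetic; the helper name is
hypothetical, and both interval_length values are shown because the nagios.cfg
in use here isn't quoted:

```python
def notification_gap_seconds(notification_interval, interval_length=60):
    """Seconds between repeat notifications while a service stays
    in a problem state (interval_length defaults to 60 in Nagios)."""
    return notification_interval * interval_length


for units in (60, 1440):
    for length in (60, 1):  # default setting vs. one-second units
        gap = notification_gap_seconds(units, length)
        print(f"notification_interval={units}, interval_length={length}: "
              f"{gap / 60:g} min")
```

With the default interval_length, a value of 60 would mean one repeat per hour
rather than per minute.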

Here's the stanza in services.cfg:

define service{
        use                             generic-service         ; Name of service template to use
        host_name                       ftp
        service_description             Disk Space
        is_volatile                     0
        check_period                    normalbusinesshours
        max_check_attempts              3
        normal_check_interval           120
        retry_check_interval            10
        contact_groups                  FTP_Alerts
        notification_interval           60
        notification_period             normalbusinesshours
        notification_options            w,u,c,r
        check_command                   check_remote_disk1
        register                        1
        }

hosts.cfg:

define host{
        use                     generic-host            ; host template to use
        host_name               ftp
        alias                   ftp.domain.com
        address                 10.11.12.13
#        check_command           check-host-alive
        max_check_attempts      10
        notification_interval   1800
        notification_period     24x7
        notification_options    d,u,r
        contact_groups          HelpDesk
        }

checkcommands.cfg:

define command{
        command_name    check_remote_disk1
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_disk
        }


And I can check the remote system from the command line:

[root@cerberus ~]# /usr/lib/nagios/plugins/check_nrpe -H ftp -c check_disk
DISK OK - free space: / 2321 MB (1% inode=99%);| /=133114MB;142786;142796;0;142806
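
The part after the "|" is plugin performance data, which by the plugin
guidelines is laid out as label=value[UOM];warn;crit;min;max.  A small sketch
for pulling those fields apart, assuming plain numeric thresholds (range
syntax such as "10:" is not handled) and a hypothetical helper name:

```python
import re


def parse_perfdata(item):
    """Split one 'label=value[UOM];warn;crit;min;max' perfdata item.
    Assumes plain numeric thresholds, not Nagios range expressions."""
    label, _, rest = item.partition("=")
    fields = rest.split(";")
    m = re.match(r"^(-?[\d.]+)(\D*)$", fields[0])
    value, uom = float(m.group(1)), m.group(2)
    nums = [float(f) if f else None for f in fields[1:5]]
    nums += [None] * (4 - len(nums))  # pad any omitted trailing fields
    warn, crit, lo, hi = nums
    return {"label": label, "value": value, "uom": uom,
            "warn": warn, "crit": crit, "min": lo, "max": hi}


d = parse_perfdata("/=133114MB;142786;142796;0;142806")
print(d["label"], d["value"], d["uom"], "warn:", d["warn"], "crit:", d["crit"])
print("below warning threshold:", d["value"] < d["warn"])
```

Read this way, the reported value sits below both thresholds, which would be
consistent with the plugin returning DISK OK instead of CRITICAL.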


Yes, I just noticed the discrepancy between contact_groups in
services.cfg and hosts.cfg.  I doubt that's the issue, as I was getting
emails yesterday.

Any help appreciated!

-- 
***********************************************************************
* John Oliver                             http://www.john-oliver.net/ *
*                                                                     *
***********************************************************************
