Some alerts not getting to sendmail
Tim Palmer
tim at tany.com
Thu Nov 18 17:13:48 CET 2010
Andreas Ericsson wrote:
> On 11/18/2010 03:48 PM, Tim Palmer wrote:
>
>> Good morning, or whatever as the case may be...
>>
>> I have a Nagios 3.2.1 install which is showing a problem I'm unsure how
>> to troubleshoot further. It's either something simple I'm missing, or a
>> deeper, more difficult problem. Or a transient to be perhaps put on a
>> shelf until it happens again.
>>
>> First, the questions:
>> - Is the notifications log absolute?
>> - Meaning, if a notification is shown in this log, it has passed all
>> filters (notification options etc) and Nagios believes it was submitted
>> to the MTA.
>>
>>
>
> Yes.
>
Excellent, thank you. That's the critical bit for me regarding Nagios.
>
>> - Is there anywhere besides the MTA's log, status.dat and nagios.log to
>> look for clues to mail problems?
>>
>
> The receiving end comes to mind, or any server(s) in between.
>
>
>> ==============
>> Details
>> - Running on FreeBSD 7.0, using stock sendmail on localhost.
>> - In general, everything is working fine. 125 hosts, 1600-ish services.
>> This system has been up and stable for a few months.
>>
>> Host and service notifications of all kinds go out properly all the time.
>>
>> Last night, I had a host go down. Notification got to my cell phone and
>> the other contacts it's configured to just fine. This morning, I dealt
>> with the problem host and Nagios showed it back up. But no Host up
>> notification to any of the configured contacts. The Notifications log
>> shows the host up notifications as having been sent. There's nothing in
>> /var/log/maillog for the time Nagios says the notifications were sent.
>> In status.dat, the record for my cell contact has a
>> "last_host_notification" line with the epoch time version of the exact
>> second the notification was in theory sent. Host and template records
>> included at the bottom of this email. I've included one contact def, but
>> there were 4 contacts, using 2 different scripts that should have
>> received the notification.
>>
>> As far as I can see, there is nothing in the host configuration or
>> related templates that would keep a host up notification from being sent.
>>
>> We use custom host-notify scripts which log actions, and again, no
>> entries for the specific problem, but lots of other notifications before
>> and after. These scripts could be the problem, but I want to rule out
>> other issues first.
>>
>>
>
> Notifications are a pretty integral part of what makes Nagios worth
> anything at all. Since you're using homebrewed scripts and no one else
> has reported any problems with them, I suggest you first debug your
> own scripts, or enable debug-logging for notifications. The docs will
> tell you how to do that. It won't help for this occurrence of the
> failed notifications, but it will definitely help you in the future
> if it ever happens again.
>
>
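For reference, the "last_host_notification" value mentioned above can be read out of status.dat and turned into a readable time with a few lines of Python. This is only a sketch: it assumes status.dat keeps contacts in "contactstatus { ... }" blocks of key=value lines, and the function name is made up for illustration.

```python
from datetime import datetime, timezone


def contact_notification_times(status_text):
    """Map contact_name -> {field: datetime} for the last_*_notification fields.

    Assumes the status.dat layout of "contactstatus {" ... "}" blocks holding
    key=value lines. A value of 0 is treated as "never notified" and skipped.
    """
    wanted = ("last_host_notification", "last_service_notification")
    results = {}
    in_block = False
    fields = {}
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("contactstatus") and line.endswith("{"):
            in_block = True
            fields = {}
        elif line == "}" and in_block:
            name = fields.get("contact_name")
            if name is not None:
                results[name] = {
                    key: datetime.fromtimestamp(int(value), tz=timezone.utc)
                    for key, value in fields.items()
                    if key in wanted and value.isdigit() and int(value) > 0
                }
            in_block = False
        elif in_block and "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value
    return results
```

The times come back in UTC; calling .astimezone() on them gives local time, which is easier to line up against /var/log/maillog.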
Agreed on all counts. Now that you've confirmed the final-ness of the
notifications log, I am comfortable looking outside Nagios to the
scripts, system and sendmail. I'm sure there's a reasonable, logical
explanation for a small subset of mail not getting from Nagios to the
local MTA...
Thank you
Tim
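
One way to follow this up is to pull the notification entries out of nagios.log and compare their times against the script logs and maillog. The sketch below assumes the usual "[epoch] HOST NOTIFICATION: contact;host;state;command;output" line layout (service entries carry an extra service field after the host); the function and field names are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Matches lines such as "[1290038400] HOST NOTIFICATION: contact;host;UP;command;output"
NOTIFICATION_RE = re.compile(
    r"^\[(?P<epoch>\d+)\] (?P<kind>HOST|SERVICE) NOTIFICATION: (?P<rest>.*)$"
)


def parse_notifications(log_lines):
    """Return one dict per HOST/SERVICE NOTIFICATION line; other lines are ignored."""
    entries = []
    for line in log_lines:
        match = NOTIFICATION_RE.match(line.strip())
        if not match:
            continue
        parts = match.group("rest").split(";")
        entries.append({
            "when": datetime.fromtimestamp(int(match.group("epoch")), tz=timezone.utc),
            "kind": match.group("kind"),
            "contact": parts[0],
            "host": parts[1] if len(parts) > 1 else "",
            # Remaining fields: state, command, output (service name first for SERVICE).
            "rest": parts[2:],
        })
    return entries
```

Any entry whose time has no matching line in the notify script's own log points at the script or its environment; an entry that the script logged but that never shows up in maillog points at the hand-off to sendmail.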
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users