Phantom service checks

Rasmus Plewe rplewe at ess.nec.de
Thu Dec 5 14:14:12 CET 2002


Hello,

The only mention of this issue I could find is a post in the mailing
list archive from Monday of last week, and it got no response.

Once upon a time I had a service check that was associated with a
couple of hosts and hostgroups. That service no longer exists; even
its command definition in checkcommands.cfg has been deleted.
A recursive grep over the Nagios directory finds the service name
only in the log files. Yet every now and again I still get
notifications telling me that this service is critical or up (its
unreliability was one of the reasons for eliminating it in the first
place). How can I get rid of this?

Another thing:
During a major outage last night, I had the opportunity to test the
"scheduled downtime" functionality. What I think happened is the
following:
- The outage began. Lots of mails were generated, for just about
  every host and service that was configured.
- I scheduled downtime for the time being. Notifications were still
  sent out (yes, I restarted Nagios).
- I stopped certain email addresses (like "half of the company" -
  oops) from getting notifications by setting the
  *_notification_period options in contacts.cfg to "none". Restarted
  Nagios. Notifications were still sent.
- I changed the email addresses in contacts so that they no longer
  pointed to these email aliases. Restarted Nagios. Notifications
  were still sent.

All in all I got the impression that Nagios does not care much about
changed configuration when it is restarted. But then I can't swear
that I didn't screw it up somehow, since I was pretty much tied up
with the outage and didn't have much time to play with Nagios at the
same time.

Can anyone make sense of this, and preferably suggest how I can get
rid of that phantom check?

Oh, and another thought: I guess there's no way to tell Nagios to
"condense" notifications? I mean, in a situation like yesterday's it
would be handy to get one notification for all incidents instead of
~150 mails. Something like "upon a failure, wait x minutes before
sending a notification; if another failure occurs, include it in the
notification and wait another x minutes. But don't wait longer than
y (> x) minutes, counted from the first failure" would be really
cool...
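
To make the idea concrete, here is a rough sketch in Python of the
batching rule I have in mind. This is not anything Nagios provides;
the class NotificationBatcher and all of its method and field names
are made up for illustration, and times are plain numbers (minutes)
that the caller passes in.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NotificationBatcher:
    """Hypothetical helper: collect failures and release them as one batch."""

    quiet_window: float  # "x": wait this long after the latest failure
    max_delay: float     # "y": never wait longer than this after the first
    pending: List[str] = field(default_factory=list)
    first_failure: Optional[float] = None
    last_failure: Optional[float] = None

    def __post_init__(self) -> None:
        # The rule only makes sense if y is larger than x.
        if self.max_delay <= self.quiet_window:
            raise ValueError("max_delay (y) must be greater than quiet_window (x)")

    def add_failure(self, message: str, now: float) -> None:
        """Record a failure seen at time `now` and restart the quiet window."""
        if not self.pending:
            self.first_failure = now
        self.last_failure = now
        self.pending.append(message)

    def deadline(self) -> Optional[float]:
        """Time at which the current batch should go out, or None if empty."""
        if not self.pending:
            return None
        return min(self.last_failure + self.quiet_window,
                   self.first_failure + self.max_delay)

    def flush_if_due(self, now: float) -> Optional[List[str]]:
        """Return all pending failures as one batch if the deadline has passed."""
        due = self.deadline()
        if due is None or now < due:
            return None
        batch = self.pending
        self.pending = []
        self.first_failure = None
        self.last_failure = None
        return batch
```

Something would have to call flush_if_due() periodically and send a
single mail containing the returned list. Every new failure pushes
the deadline back by x minutes, but the deadline can never move past
y minutes after the first failure in the batch.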


Regards,
         Rasmus

