Acknowledgement Escalations
Mathieu Gagné
mgagne at iweb.com
Thu Jan 22 06:26:31 CET 2009
Hi,
First, thanks for your time and input.
RijilV wrote:
> 2009/1/21 Mathieu Gagné <mgagne at iweb.com>
>
> Here is the situation:
> Somebody acknowledges a problem and forgets about it.
> How would you implement an acknowledgement escalation?
>
> Mmmm, there are a couple of technology things you could do for this, but
> the root of this problem is people, not computers.
Yeah. I know, you know, our managers know. However, we just can't beat
them for making mistakes or being busy with other problems. :)
> You need to work out
> a process where people aren't ack'ing things just so they can fall back
> asleep. I personally suggest having nagios create a ticket with
> whatever ticketing system you use (you use one right?!) so you can track
> that issue. That and having a 24x7 NOC helps :)
Yes. We use Request Tracker (RT), and I personally spent about 7 days
working on the integration of Nagios with RT and our internal customer
database.
So basically:
1) Problem: New Ticket
2) Acknowledgement: New comment about it
3) Recovery: Comment + Status=Resolved
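As a rough sketch of the mapping above (not our actual integration
code): the keys are values of the Nagios $NOTIFICATIONTYPE$ macro, while
the function name and the action labels are made up for illustration and
are not RT API calls.

```python
# Hypothetical sketch: map a Nagios notification type to a ticket action.
# The action names on the right are illustrative labels only.
TICKET_ACTIONS = {
    "PROBLEM": "create_ticket",                 # 1) Problem: new ticket
    "ACKNOWLEDGEMENT": "add_comment",           # 2) Acknowledgement: new comment
    "RECOVERY": "add_comment_and_resolve",      # 3) Recovery: comment + resolved
}


def ticket_action(notification_type):
    """Return the ticket action for a notification type, or 'ignore'."""
    return TICKET_ACTIONS.get(notification_type.upper(), "ignore")
```

Any other notification type (flapping, downtime, etc.) falls through to
"ignore" in this sketch; whether that is the right choice depends on the
setup.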
We also implemented another escalation system within RT:
1) No update for x minutes/hours, the manager gets informed about it.
2) No answer from the manager, his manager gets informed, etc. until the
Pope gets informed.
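The escalation chain could be sketched roughly like this. It assumes a
fixed interval between escalation steps and an ordered list of people to
inform; both the function names and the example chain are hypothetical,
and the real thing lives in RT, not in a script like this.

```python
def escalation_level(minutes_since_update, step_minutes):
    """0 = nobody yet, 1 = first manager, 2 = his manager, and so on."""
    if step_minutes <= 0:
        raise ValueError("step_minutes must be positive")
    return max(0, int(minutes_since_update // step_minutes))


def who_to_inform(minutes_since_update, step_minutes, chain):
    """Return the person to inform, or None if no escalation is due.

    chain is ordered from the first manager up to the top; once the top
    is reached, the escalation stays there.
    """
    level = escalation_level(minutes_since_update, step_minutes)
    if level == 0 or not chain:
        return None
    return chain[min(level, len(chain)) - 1]
```

For example, with chain = ["manager", "director", "Pope"] and a
30-minute step, a ticket untouched for 65 minutes would go to the
director.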
And if they (the ones who forget) try to close the ticket while the
problem is still unresolved from Nagios' perspective, a comment is
added telling them so and the ticket is reopened.
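A minimal sketch of that close guard, assuming the current state can be
looked up as a plain string ("OK"/"UP" meaning healthy). The function
name, return values and comment text are invented for illustration.

```python
def on_close_attempt(nagios_state):
    """Decide what happens when someone tries to resolve the ticket.

    Returns (ticket_status, comment); comment is None when the close is
    allowed to stand.
    """
    state = nagios_state.upper()
    if state in ("OK", "UP"):
        # Nagios agrees the problem is gone: let the ticket be resolved.
        return ("resolved", None)
    # Nagios still sees a problem: reopen and explain why.
    comment = "Nagios still reports this problem as %s; reopening." % state
    return ("open", comment)
```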
> I would probably write that program to un-acknowledge things as well as
> alarming.
We thought about it. However, our customer would start to receive
problem alerts again, which is bad. I mean, we told him we acknowledged
the problem; we just can't tell him the problem is still going on after
1h. :)
Anyway, I thought there was a better way to deal with it within Nagios.
But relying on an external ticketing system, as per your suggestion, is
probably the best solution.
Should we be able to set hostescalation/serviceescalation even if the
problem is acknowledged? But on the other hand, when will it end? :)
Any other ideas or opinions?
--
Mathieu