service alert aggregation?
Joshua Barratt
jbarratt at serialized.net
Tue Sep 30 07:27:45 CEST 2003
>I would argue you *do* want to be notified about situations where your
>servers are failing multiple service checks. After all, isn't that the
>point of monitoring? Moreover, you want to be notified when these
>services reach a warning state so things don't reach the critical point.
Ah. I'm afraid I've been unclear. I very much do want to know about
cases in which I have multiple services failing!
However, assuming I have a sudden-onset catastrophic failure, such as
what happened last night (the system started swapping at an insane
rate), this is the alert sequence that was generated:
4:09 AM "HTTP is CRITICAL"
4:09 AM "IMAP is CRITICAL"
4:10 AM "FTP is CRITICAL"
4:12 AM "SMTP is CRITICAL"
...
and then the 4 corresponding "... is OK" pages.
I'm fairly certain that, in this case, the services all went critical at
about the same time; it's just that, the way the checks were scheduled,
Nagios wasn't sure (attempt 3 of 3) until 4:10 and 4:12 that FTP and
SMTP were actually down.
What I would much rather have is 2 pages instead of 8:
4:10 AM "HTTP,IMAP,FTP,SMTP are CRITICAL"
...
4:21 AM "HTTP,IMAP,FTP,SMTP are OK"
So suppose a script trapped the first alert that would have been
generated (HTTP is CRITICAL) and, because of that, scheduled service
checks for "now" on that host. If it then waited (say 30 seconds) for
any further alerts to come through for that host, it could send a single
alert like the above instead of the flurry we have been getting.
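A rough sketch of that idea in Python, just to make it concrete. Everything here is hypothetical: the Alert tuple, the function names, the "web1" host and the 30-second default are made up for illustration. The one Nagios-specific piece is the SCHEDULE_HOST_SVC_CHECKS external command, which I am assuming is available in the version in use. The sketch only builds the command line and the page text; writing to the Nagios command file and actually waiting out the window are left out.

```python
from typing import Dict, List, NamedTuple, Tuple


class Alert(NamedTuple):
    """One notification as the trapping script would see it (hypothetical shape)."""
    time: float    # seconds, from any consistent clock
    host: str
    service: str
    state: str     # e.g. "CRITICAL" or "OK"


def recheck_command(host: str, now: int) -> str:
    """Build the external command line asking Nagios to check every
    service on `host` right away. Assumes SCHEDULE_HOST_SVC_CHECKS is
    supported by the installed version."""
    return f"[{now}] SCHEDULE_HOST_SVC_CHECKS;{host};{now}"


def aggregate(alerts: List[Alert], window: float = 30.0) -> List[str]:
    """Merge alerts for the same host and state that arrive within
    `window` seconds of the first alert in their group."""
    # Each group: (time of first alert, host, state, services seen so far)
    groups: List[Tuple[float, str, str, List[str]]] = []
    latest: Dict[Tuple[str, str], int] = {}  # (host, state) -> index of open group

    for a in sorted(alerts, key=lambda x: x.time):
        key = (a.host, a.state)
        idx = latest.get(key)
        if idx is None or a.time - groups[idx][0] > window:
            # First alert for this host/state, or the window has closed:
            # start a new page.
            groups.append((a.time, a.host, a.state, [a.service]))
            latest[key] = len(groups) - 1
        else:
            groups[idx][3].append(a.service)

    pages = []
    for _, host, state, services in groups:
        verb = "is" if len(services) == 1 else "are"
        pages.append(f"{host}: {','.join(services)} {verb} {state}")
    return pages
```

With the four CRITICAL alerts landing inside one window and the four OK alerts inside another, aggregate() returns two pages instead of eight. An alert that shows up after the window has closed simply starts a new page, so nothing is dropped.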
My goal is not to reduce the amount of information flowing to the
admins, just to cut down the number of pages carrying it. Legitimate
pages need to get through as soon as the problem is known, but the
fewer the better!
Sorry for my initial lack of clarity, and thanks for the response...
Josh
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null