Strategies for coping with self-DOS?
Chris Beattie
cbeattie at geninfo.com
Thu Aug 23 16:42:33 CEST 2012
Where I work, the server engineers want Nagios to notify them fairly
quickly when a problem develops. During the day, the settings are fine.
Recently, however, the nightly backups and scheduled antivirus scans
began causing enough load that monitored hosts become unavailable
briefly, but still long enough that Nagios sends notifications that
reach their pagers.
What are some of the strategies you use to deal with this?
The last time I dealt with this, I had two service template files, which
specified different max_check_attempts and retry_intervals for day and
night. I used a cron job to copy the appropriate template file to a
name Nagios was configured to load, and restart Nagios.
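
A minimal sketch of what the two template files might have looked like,
assuming Nagios 3.x object syntax. The file names, the template name,
and every value are hypothetical; only max_check_attempts and the retry
interval come from the description above.

```cfg
# services-day.cfg (hypothetical file name): notify quickly during the day
define service {
    name                  paged-service   ; hypothetical template name
    max_check_attempts    3               ; assumed value
    retry_interval        1               ; assumed value, in interval_length units
    register              0               ; template only, not a real service
}

# services-night.cfg (hypothetical file name): tolerate backup and scan load
define service {
    name                  paged-service   ; same name, so services need no change
    max_check_attempts    6               ; assumed value
    retry_interval        2               ; assumed value
    register              0
}
```

Only one of the two files is ever loaded: the cron job copies the
appropriate one over a file that a cfg_file= line in nagios.cfg points
to, then restarts Nagios.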
As we upgraded things, the problem went away, so I ditched that setup.
It always seemed like a kludge. Scheduled restarts just smell like
failure to me, and they don't scale well if you have thousands
of hosts and services. Well, our server estate has continued to expand
and now we're back to committing own-goals with the midnight pages.
This time, I'm thinking about defining escalations with different
timeperiods, but I'm curious to find out what other approaches have been
successful.
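
A rough sketch of the escalation idea, assuming Nagios 3.x object
syntax. The timeperiod names, the 00:00-06:00 window, the hostgroup and
contactgroup names, and all numeric values are assumptions made for
illustration. It also assumes the services' own contacts are not paged
(email only, say), because at night they would still receive the first
two notifications.

```cfg
# Hypothetical window covering the nightly backups and antivirus scans.
define timeperiod {
    timeperiod_name   backup-window
    alias             Nightly backup and scan window
    sunday            00:00-06:00
    monday            00:00-06:00
    tuesday           00:00-06:00
    wednesday         00:00-06:00
    thursday          00:00-06:00
    friday            00:00-06:00
    saturday          00:00-06:00
}

# The rest of the day.
define timeperiod {
    timeperiod_name   outside-backup-window
    alias             Everything except the backup window
    sunday            06:00-24:00
    monday            06:00-24:00
    tuesday           06:00-24:00
    wednesday         06:00-24:00
    thursday          06:00-24:00
    friday            06:00-24:00
    saturday          06:00-24:00
}

# Outside the window: pagers are included from the first notification.
define serviceescalation {
    hostgroup_name          all-servers     ; hypothetical hostgroup
    service_description     *
    first_notification      1
    last_notification       0               ; 0 = keep escalating until recovery
    notification_interval   30              ; assumed value
    contact_groups          server-pagers   ; hypothetical contactgroup
    escalation_period       outside-backup-window
}

# Inside the window: pagers are included only from the third notification.
define serviceescalation {
    hostgroup_name          all-servers
    service_description     *
    first_notification      3
    last_notification       0
    notification_interval   30
    contact_groups          server-pagers
    escalation_period       backup-window
}
```

With this layout nothing is copied or restarted on a schedule. How long
a nighttime problem must persist before the third notification depends
on the notification_interval of the services themselves.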
Thanks!
--
-Chris