Simulating downtime in nagios
Andy Shellam
andy-lists at networkmail.eu
Mon Oct 6 22:06:28 CEST 2008
Hi Kelly,
When I've done this in the past for network services (e.g. http/smtp
checks), I've actually blocked the target port on the Nagios server,
which simulates the service being down more realistically (e.g. for HTTP
checks, block the Nagios server's outbound port 80).
This works for us because as well as the router firewalls, each server
runs a local software firewall, so it's easy to block the Nagios
server's outbound packets to a particular port without affecting the
service itself. The checks then fail just as they would in a real
network/service failure.
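
As a rough sketch of the port-blocking approach, assuming the local
firewall on the Nagios server is iptables (the message does not say which
firewall is in use) and using a made-up target address, the rule for
starting and ending a drill could be built like this. The function name
and the address are hypothetical; the script only prints the commands, it
does not run them.

```python
import shlex
from typing import List


def outbound_block_rule(action: str, dest_host: str, port: int) -> List[str]:
    """Build an iptables argv that adds ("-A") or deletes ("-D") a rule
    rejecting outbound TCP traffic from this machine to dest_host:port."""
    if action not in ("-A", "-D"):
        raise ValueError("action must be -A (add) or -D (delete)")
    return [
        "iptables", action, "OUTPUT",
        "-p", "tcp",
        "-d", dest_host,
        "--dport", str(port),
        "-j", "REJECT",
    ]


# Hypothetical monitored web server; replace with a real address.
start_drill = outbound_block_rule("-A", "192.0.2.10", 80)
end_drill = outbound_block_rule("-D", "192.0.2.10", 80)

# Print the commands for review; running them needs root on the Nagios server.
print(shlex.join(start_drill))
print(shlex.join(end_drill))
```

REJECT makes the check fail quickly with a refused connection; using DROP
instead would make it fail by timeout, which is closer to a dead host.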
However, when it comes to checks such as disk space, it can be a bit
trickier! I've done things like changing the thresholds for a failure
(e.g. if disk space is currently 15% capacity, I set my warning alert to
20%, restart Nagios and wait for the alerts to come, do the same for
critical, then reset back to 90% when complete). I have also done as you
suggested: change the service's check and retry intervals in Nagios to
something lengthy (e.g. an hour), then submit a passive 'failure' check
result and wait until Nagios re-checks the service. This method also
tests how Nagios alerts you when the service returns to OK.
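
For the passive-result method, a minimal sketch of what gets written to
the Nagios external command file is shown below. The
PROCESS_SERVICE_CHECK_RESULT line format is the standard external command;
the helper names, host name, service description, and the command file
path mentioned in the comments are assumptions for illustration and will
differ per installation.

```python
import time
from typing import Optional


def passive_result(host: str, service: str, return_code: int,
                   output: str, now: Optional[float] = None) -> str:
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line.

    return_code follows plugin conventions:
    0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN.
    """
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return_code must be 0, 1, 2 or 3")
    timestamp = int(time.time()) if now is None else int(now)
    return (f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{return_code};{output}\n")


def submit(command_file: str, line: str) -> None:
    """Write one command line to the external command file (a named pipe).

    The path is installation-specific; /usr/local/nagios/var/rw/nagios.cmd
    is a common default for source installs, but treat that as an assumption.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Hypothetical host and service names, printed rather than submitted.
print(passive_result("web01", "HTTP", 2, "CRITICAL - fire drill"), end="")
```

Calling submit() with the real command file path from cron would start the
drill; the service must accept passive checks for the result to be used.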
Hope this helps, it'd be interesting to hear how/if others do it!
Andy
Kelly Jones wrote:
> What's the best way to simulate (not schedule) downtime in nagios?
>
> I want to "pretend" a service is down for a certain amount of time to
> see what alerts nagios sends, etc.
>
> I've come up w/ two bad ways to do this:
>
> % Edit the config file to change the test to "check_dummy". I want to
> run these "fire drills" via cron, and editing a file and restarting
> nagios seems a little ugly.
>
> % Submit a passive check saying the service is down, and reschedule
> the next check 4 hours later, so the service is 'down' for 4
> hours. This can be done using the nagios named pipe, so it's easy to
> cron. Problem: doing things this way suppresses the alerts (when you
> don't test a service, it doesn't send an alert).
>
> Thoughts?
>
>