Freshness Checks
Joseph L. Casale
JCasale at activenetwerx.com
Wed Jul 15 17:46:23 CEST 2009
I set up a check for a few very large backup jobs, and it works as expected.
If I care to look, I see a WARNING state once the NSCA client starts the job
at 19:00, and the freshness check at 19:10 passes if the job actually
started. No notifications go out at this point.
I then schedule a check at 23:30 to look for an updated status, which would
be OK if the backup job finished or CRITICAL if the job knows it failed. If
the job dies without updating at all, the freshness interval is just shorter
than the wait before this check, so the result is stale and the service goes
CRITICAL either way. Notifications go out if anything went wrong.
That's good for big jobs, but I now want to add some tiny jobs that rsync
a very small set of files. The problem is that they finish so fast that my
routine below can't handle them. I don't want to set up multiple definitions
and checks. I know I could wait until the next day to find out, for example,
but that isn't good enough for some jobs.
Is there some better way to accomplish this?
Thanks!
jlc
timeperiods.cfg
# Start this window JUST after the backup starts at 19:00, so that it picks up
# the updated (fresh) state of the backup passive check.
# Stop monitoring shortly after! The freshness interval is short; we need to know if the backup fails.
define timeperiod{
        timeperiod_name Backup_Window
        alias           Weekday Backup Window
        monday          19:10-19:15 ; Freshness should be "fresh" as backup starts at 19:00.
        monday          23:30-24:00 ; Freshness interval MUST be shorter than time between these two periods (6:15).
        tuesday         19:10-19:15
        tuesday         23:30-24:00
        wednesday       19:10-19:15
        wednesday       23:30-24:00
        thursday        19:10-19:15
        thursday        23:30-24:00
        friday          19:10-19:15
        friday          23:30-24:00
        }
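For reference, the Nagios object documentation and sample configs give several time ranges for one weekday as a comma-separated list on a single directive, rather than repeating the weekday. The following is only a sketch of the same window in that form, assuming the times above are unchanged. Whether a repeated weekday line is merged with the earlier one or replaces it is worth verifying against the Nagios version in use.

```cfg
# Same Backup_Window, with both ranges for each day on one line
define timeperiod{
        timeperiod_name Backup_Window
        alias           Weekday Backup Window
        monday          19:10-19:15,23:30-24:00
        tuesday         19:10-19:15,23:30-24:00
        wednesday       19:10-19:15,23:30-24:00
        thursday        19:10-19:15,23:30-24:00
        friday          19:10-19:15,23:30-24:00
        }
```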
templates.cfg
# Backup Job Passive Service definition
define service{
        name                    passive-service
        active_checks_enabled   0
        passive_checks_enabled  1
        flap_detection_enabled  0
        register                0
        is_volatile             0
        check_period            Backup_Window
        max_check_attempts      1
        normal_check_interval   5
        retry_check_interval    1
        check_freshness         1
        freshness_threshold     21600 ; 6:00, All backups should be done by now. Correlate this with the Backup_Window Time Definition.
        contact_groups          admins
        notification_interval   90000
        notification_period     24x7
        notification_options    u,c
        }
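Service freshness checking also depends on the main configuration file, which is not shown in this post. A minimal sketch of the relevant nagios.cfg directives follows. Both values are assumptions: the 60-second interval is the usual sample-config default, not a setting taken from this setup.

```cfg
# nagios.cfg (assumed; not part of the original post)
# Enable freshness checking for services
check_service_freshness=1
# How often, in seconds, Nagios looks for stale service results
service_freshness_check_interval=60
```

When a passive result is older than freshness_threshold, Nagios runs the service's check_command as a forced active check, even with active_checks_enabled set to 0.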
Object.cfg
define service{
        use                     passive-service
        host_name               host.domain.com
        service_description     Daily Backup - BackupServer1
        check_command           check_dummy!2
        }
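The check_command above relies on a check_dummy command definition that is not included in the post. A sketch of what such a definition might look like, assuming the standard check_dummy plugin is installed in the directory that $USER1$ points to:

```cfg
# commands.cfg (assumed; not part of the original post)
define command{
        command_name    check_dummy
        command_line    $USER1$/check_dummy $ARG1$
        }
```

With check_dummy!2 the plugin exits with state 2, so a stale passive result turns the service CRITICAL.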
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users