Monitoring a process
Hari Sekhon
hpsekhon at googlemail.com
Wed Nov 14 10:25:10 CET 2007
One thing I notice in your configuration, apart from the massive retry
interval, is that you are inheriting a max_check_attempts of 15 from the
leadlander template. This means you have to wait up to 60 minutes for the
1st failure and then another 14 * 6 hours for the 15th failure, at which
point the service goes into a hard CRITICAL state and alerts you.

Shorten your check intervals and/or lower max_check_attempts for this
service so that it reaches a hard state sooner, and then you will get your
notification.
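
For example, as a sketch only (the shorter intervals and attempt count
below are my suggestion; the rest mirrors your existing definition),
overriding the inherited values directly in the service definition would
get you a hard CRITICAL and a notification within roughly 10 to 15 minutes
of the process dying:

define service{
        use                     leadlander
        host_name               leadlandervm
        service_description     Outlook Process
        contact_groups          bo,mis
        notification_options    w,u,c,r
        max_check_attempts      3
        normal_check_interval   5
        retry_check_interval    5
        check_command           check_nt2!PROCSTATE!-d SHOWALL -l outlook.exe
        }

With the default interval_length of 60 seconds, that means: first failure
within 5 minutes, two retries 5 minutes apart, hard state on the 3rd
consecutive failure, notification sent.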
-h
Hari Sekhon
Jerad Riggin wrote:
> Just to clarify
>
> normal_check_interval 60
> retry_check_interval 360
> notification_interval 360
>
> This means that it checks every hour. After it fails the first time, it
> will wait 6 hours to retry; if it is still not OK after 6 hours, it
> notifies and then keeps notifying every 6 hours until it is OK again.
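>
> Assuming the default interval_length of 60 seconds (check nagios.cfg if
> yours differs), those values translate to:
>
>     normal_check_interval    60  ->  check every 60 minutes while OK
>     retry_check_interval    360  ->  retry every 6 hours after a failure
>     notification_interval   360  ->  re-notify every 6 hours while the
>                                      problem persists in a hard state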
>
> On Nov 13, 2007 3:00 PM, Rich Sasko <rsasko at niag.com> wrote:
>
>> It only notifies after the service enters a hard state, which is usually
>> after the third try.
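>>
>> For example, with max_check_attempts 3:
>>
>>     check 1 fails  ->  SOFT 1/3, no notification
>>     check 2 fails  ->  SOFT 2/3, no notification
>>     check 3 fails  ->  HARD 3/3, notification goes out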
>>
>> Richard Sasko
>> Niagara Lasalle Corp
>> Phone: (219) 853-6272
>> Mobile: (219) 484-5617
>> E-mail: rsasko at niag.com
>>
>>
>>
>> -----Original Message-----
>> From: Jerad Riggin [mailto:jriggin at gmail.com]
>> Sent: Tuesday, November 13, 2007 2:54 PM
>> To: Rich Sasko
>> Subject: Re: [Nagios-users] Monitoring a process
>>
>> Thanks for the response. This is what I did just a bit ago. I have it
>> successfully monitoring outlook.exe; however, no e-mails are being sent
>> when it goes critical.
>>
>> Here is what I have as the service definition:
>>
>> define service{
>>         use                     leadlander
>>         host_name               leadlandervm
>>         service_description     Outlook Process
>>         contact_groups          bo,mis
>>         notification_options    w,u,c,r
>>         check_command           check_nt2!PROCSTATE!-d SHOWALL -l outlook.exe
>>         }
>>
>> Here is what I have for the leadlander template
>>
>> define service{
>>         use                     generic-service
>>         name                    leadlander
>>         is_volatile             0
>>         check_period            24x7
>>         max_check_attempts      15
>>         normal_check_interval   20
>>         retry_check_interval    20
>>         notification_interval   20
>>         notification_period     24x7
>>         register                0
>>         }
>>
>> Does it only notify after the first retry failure, or should it notify
>> as soon as the service is critical? Any ideas?
>>
>>
>> On Nov 13, 2007 2:39 PM, Rich Sasko <rsasko at niag.com> wrote:
>>
>>> Jerad Riggin <jriggin <at> gmail.com> writes:
>>>
>>>
>>>> Would it be possible using Nsclient++ to monitor for a process name?
>>>> We need to make sure Outlook.exe is running on the server and if it
>>>> isn't, send out notifications.
>>>>
>>>> Thanks,
>>>>
>>>> Jerad
>>>>
>>>>
>>>>
>>> Jerad,
>>>
>>> We have NSClient++ running on our Windows servers, and the following
>>> service check is an example of how we are monitoring services from the
>>> Nagios server:
>>>
>>> define service{
>>>         use                     generic-service
>>>         host_name               email server
>>>         service_description     explorer
>>>         check_command           check_nt!PROCSTATE!-d SHOWALL -l explorer.exe
>>>         }
>>>
>>> It is actually in one of the sample config files; you should just have
>>> to tell it which process you want to watch.
>>>
>