Service Configuration Question
Michael Eck
meck at chilitech.net
Wed Mar 24 15:38:53 CET 2004
What an interesting thread.
I have to say that I ultimately agree with Paul Allen's analysis. That
is, a service check is performed to determine the status of a service,
not the status of the delivery mechanism. I think that a state of
UNKNOWN is appropriate for a service check that fails because, for
example, snmpd or the host is down. I appreciate all of your input.
Thanks.
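
The distinction above (status of the service versus status of the
delivery mechanism) can be sketched in a few lines of Python. This is
only an illustration, not code from any existing plugin: the exit codes
are the standard Nagios plugin return codes, while run_check, fetch and
evaluate are hypothetical names assumed for the example.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_check(fetch, evaluate):
    """Return (exit_code, message) for one service check.

    fetch    -- hypothetical callable that retrieves the raw value and
                raises an exception when the delivery mechanism (snmpd,
                the host, the network) fails.
    evaluate -- hypothetical callable that maps the fetched value to
                OK, WARNING or CRITICAL.
    """
    try:
        value = fetch()
    except Exception as exc:
        # Nothing is known about the service itself, so report UNKNOWN
        # instead of CRITICAL.
        return UNKNOWN, "UNKNOWN - could not fetch data: %s" % exc
    return evaluate(value), "value=%s" % value
```

With this shape, a timed-out snmpget ends up as UNKNOWN, and CRITICAL is
reserved for a value that was actually read and found to be bad.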
I think we've lost sight of my original question. Perhaps I should
provide a bit more background. I'm monitoring a lot of statistics on
wireless bridges. One of these statistics is the percentage of
retransmitted frames. Classifying my devices by the thresholds they
surpass helps me prioritize who gets fixed and in what order. The
problem is that when I have all these critical and warning states on my
wireless bridges, Nagios is also performing host checks, generating
unneeded load on my NMS and unnecessary ICMP traffic on the network.
I'd like to configure Nagios to only perform a host check on state
UNKNOWN for this particular service. It doesn't seem like this is
possible though.
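
As a rough sketch of the classification described above, assuming
made-up thresholds of 5% (warning) and 10% (critical) and hypothetical
function names: needs_host_check only expresses the policy being asked
for. It is not a Nagios option or API.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_retransmits(retransmitted, total, warn_pct=5.0, crit_pct=10.0):
    """Map retransmitted-frame counters to a Nagios state.

    The threshold defaults are assumptions for illustration only.
    """
    # Missing or unusable counters mean the statistic could not be read.
    if retransmitted is None or total is None or total <= 0:
        return UNKNOWN
    pct = 100.0 * retransmitted / total
    if pct >= crit_pct:
        return CRITICAL
    if pct >= warn_pct:
        return WARNING
    return OK


def needs_host_check(state):
    """Desired policy: only an UNKNOWN result should trigger a host check."""
    return state == UNKNOWN
```

Under this policy, a bridge at 12% retransmits would be CRITICAL but
would not cause a host check. Only a bridge whose counters could not be
read at all would.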
-michael
On Mar 24, 2004, at 7:42 AM, Karl DeBisschop wrote:
> On Wed, 24 Mar 2004 10:33:07 +0100
> joerg.helmert at aracomp.de wrote:
>
>>> -----Original Message-----
>>> From: nagios-users-admin at lists.sourceforge.net
>>> [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf
>>> Of Andreas Ericsson
>>> Sent: Wednesday, March 24, 2004 9:31 AM
>>> To: nagios-users at lists.sourceforge.net
>>> Subject: Re: [Nagios-users] Service Configuration Question
>>>
>>>
>>> Karl DeBisschop wrote:
>>>
>>>> On Tue, 23 Mar 2004 15:39:31 +0100
>>>> Andreas Ericsson <ae at op5.se> wrote:
>>>
>>> --[ snip ]--
>>>
>>>>
>>>>> Again, that's not what I said. But if the plugin can't fetch any data
>>>>> at ALL (snmpget times out), it's supposed to return CRITICAL and not
>>>>> UNKNOWN.
>>>>
>>>>
>>>> That sort of depends - if you are checking the status of snmp,
>>>> then yes. But say you are trying to find out if a disk is full,
>>>> whether or not snmp is running has very little to do with whether
>>>> the disk is full.
>>>>
>>> I totally disagree. If the plugin fetches disk status input from
>>> nsclient or nrpe (or snmp, for that matter) and can't get it, it's a
>>> critical error (service not running).
>>> If it fetches it from the LOCAL server and can't get it, then it's most
>>> likely due to filesystem error (or a plugin bug, which we have to
>>> disregard for the sake of this discussion), which is definitely
>>> critical.
>>>
>> I agree with Andreas.
>> I use monitoring to know everything is ok.
>> If a plugin is not able to do its work, I do not know what's going on.
>> So that is CRITICAL for me.
>>
>> The way the plugin gets its data doesn't play a role for me.
>> (local/snmp/nrpe/nsca/ssh or whatever...)
>>
>> In above scenario you of course do not know if diskstate is ok or not,
>> if snmp fails.
>> But if snmp fails, something is wrong and that is CRITICAL.
>
> Then you want to configure nagios to notify you on UNKNOWN. End of
> story.
>
> If I change the plugin to report CRITICAL, those who use the current
> distinction lose information as a result. For instance, you might very
> reasonably have a Nagios admin team fix or at least diagnose the state
> unknown, whereas a local admin would deal with the disk full. (Though
> I'm not sure how easy this would be in Nagios as it stands, at least
> without some add-in mail agent.)
>
> OTOH, I can understand the desire for a distinction between a
> plugin syntax error and the UNKNOWN state. I would be happy to code that
> into the plugins IF Ethan felt he wanted to honor that distinction in
> Nagios.
>
> --
> Karl
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by: IBM Linux Tutorials
> Free Linux tutorial presented by Daniel Robbins, President and CEO of
> GenToo technologies. Learn everything from fundamentals to system
> administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>
----
Michael Eck
Chilitech Internet Solutions
Network Operations Center
570-323-2166
http://www.chilitech.net