No Output
Marc Powell
marc at ena.com
Tue Jun 27 15:31:49 CEST 2006
> -----Original Message-----
> From: Williard, Jason [mailto:Jason.Williard at chartercom.com]
> Sent: Monday, June 26, 2006 5:58 PM
> To: nagios-users at lists.sourceforge.net; Marc Powell
> Subject: RE: [Nagios-users] No Output
>
> > -----Original Message-----
> > From: Williard, Jason [mailto:Jason.Williard at chartercom.com]
> > Sent: Monday, June 26, 2006 5:03 PM
> > To: Marc Powell; nagios-users at lists.sourceforge.net
> > Subject: RE: [Nagios-users] No Output
> >
> > HOSTS.CFG ENTRY
> > ----------------
> > define host{
> > use generic-hos
> > host_name 22xx-WOR-905
> > alias WOR 905 Tunnel 1
> > address 172.29.xxx.xxx
> > parents KWA-Core-7206,MOR-7206
> > check_command check-host-alive
> > max_check_attempts 10
> > notification_interval 60
> > notification_period 24x7
> > notification_options d,u,r
> > contact_groups field-admins
> > }
> >
> > CHECKCOMMANDS.CFG ENTRY
> > ------------------------
>
> Thanks. Looks good.
> >
> > TEST RUN
> > ---------
> > [root at web nagios]# /usr/lib/nagios/plugins/check_ping -H 172.29.xxx.xxx -w 3000.0,80% -c 5000.0,100% -p 1
> > CRITICAL - Plugin timed out after 10 seconds
>
> You should always perform your tests as the nagios user. Root can run
> commands that a normal user might not be able to (including ping) and
> you may see different output. In any event, the output above for this
> host looks good. Do you have the same check-host-alive command specified
> for KWA-Core-7206 and MOR-7206? I am presuming that you can't ping this
> host because one/both of those are down? How about the same kind of test
> run for those two hosts. Also, try pinging them directly from the
> command line with /bin/ping -n -U -c 1 172.29.xxx.xxx.
>
> Did you upgrade the plugins when you upgraded nagios? Were there any
> other system upgrades performed at the same time?
>
>
>
> I ran the exact same test as the nagios user and got the same result:
> [root at vuo02web nagios]# su nagios
> sh-3.00$ /usr/lib/nagios/plugins/check_ping -H 172.29.248.75 -w 3000.0,80% -c 5000.0,100% -p 1
> CRITICAL - Plugin timed out after 10 seconds
>
>
> As for the KWA-Core-7206 & MOR-7206 sites; these are both UP and
> pingable. I know that the two sites currently showing UNREACHABLE are
> down. We can confirm this by looking at their tunnel status. However,
> in previous versions of Nagios, it would show the site as DOWN rather
> than UNREACHABLE.

That should still be the case. Nagios determines down vs. unreachable
by running the host check_command for the parents. If that returns down,
then the initial host is unreachable; otherwise the host is just down. I
don't use host checks, so I am unable to verify this myself, but I haven't
heard of anyone else having the same issue. I've looked through checks.c
and the only place an UNREACHABLE status is assigned is here --

[parent checks removed. Sets value of route_blocked to FALSE if they are
all up]

    /* if this host has at least one parent host and the route to this host
       is blocked, it is unreachable */
    if(route_blocked==TRUE && hst->parent_hosts!=NULL)
        return_result=HOST_UNREACHABLE;
    /* else the parent host is up (or there isn't a parent host), so this
       host must be down */
    else
        return_result=HOST_DOWN;

This is why I believe that the UNREACHABLE status is coming from the
checks of the parents.
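
To make the decision above easier to reason about, here is a rough Python sketch of the same branch. It is not the Nagios source: the function name, the state constants and the list-of-parent-states input are made up for illustration. The parent loop was removed from the excerpt, so the rule used for route_blocked (blocked only when no parent is UP) is an assumption on my part.

```python
# Hypothetical state constants for this sketch only; they are not taken
# from the Nagios headers.
HOST_UP, HOST_DOWN, HOST_UNREACHABLE = 0, 1, 2


def classify_failed_host(parent_states):
    """Classify a host whose own check has already failed.

    parent_states holds the current state of each parent host
    (an empty list means the host has no parents).
    """
    # ASSUMPTION: the route counts as blocked only when no parent is UP.
    route_blocked = all(state != HOST_UP for state in parent_states)

    # Mirrors the quoted branch: at least one parent and a blocked route
    # means the host is unreachable.
    if route_blocked and parent_states:
        return HOST_UNREACHABLE

    # Otherwise a parent is up (or there is no parent), so the host is down.
    return HOST_DOWN
```

Under that assumption, a host with both parents UP, as reported for KWA-Core-7206 and MOR-7206, would come out as HOST_DOWN, not HOST_UNREACHABLE.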
> As well, the Status Information would display something like "CRITICAL -
> Plugin timed out after 10 seconds" rather than "(No output!)". I am
> assuming this is a plugin issue, but, as mentioned before, I am assuming
> the plugins are working as the same check-host-alive command works for
> sites that are up.
The no output error is interesting and my feeling is that check_ping is
having problems parsing the output of /bin/ping when a host is down.
That's why I was expecting an error when you ran it manually. Try
extending the timeout to something longer to see if you get different
output --
    /usr/lib/nagios/plugins/check_ping -H 172.29.xxx.xxx -w 3000.0,80% -c 5000.0,100% -p 1 -t 30
It definitely shouldn't have taken 10 seconds to send 1 ping to that
host in the first place. Using strace might show something interesting
if you know how to use that application.
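
If it helps with the manual test, below is a small Python sketch that runs a plugin command with a longer timeout and reports the exit state together with the first line of output, falling back to "(No output!)" when the plugin prints nothing. The function name run_plugin and the 30-second default are my own choices for illustration; the 0-3 exit code mapping is the standard plugin convention. This only approximates how a missing first line would surface, it is not how Nagios itself runs checks.

```python
import subprocess

# Standard plugin exit codes mapped to state names.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv, timeout=30):
    """Run a plugin command and return (state, first line of output)."""
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            text=True,
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this is raised.
        return "CRITICAL", "(plugin did not finish within %d seconds)" % timeout

    lines = proc.stdout.splitlines()
    first_line = lines[0].strip() if lines else ""

    # Exit codes outside 0-3 are reported as UNKNOWN in this sketch.
    state = STATES.get(proc.returncode, "UNKNOWN")
    return state, first_line or "(No output!)"
```

For example, run as the nagios user, run_plugin(["/usr/lib/nagios/plugins/check_ping", "-H", "172.29.xxx.xxx", "-w", "3000.0,80%", "-c", "5000.0,100%", "-p", "1", "-t", "30"]) would show whether the plugin prints anything at all for the down host once its own timeout is raised.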
> When we did the upgrade, we basically wiped the whole Nagios system and
> installed the new version. The new latest plugins were installed along
> with version 2.4 of Nagios. The old cfg files were copied into the
> config directory and modified to fit the new parameters.
Ok. Just on the off chance, have you verified that you have only 1
nagios daemon running?
--
Marc