Nagios just stopped running
Rimbert Rivera
rrivera at comtex.com
Wed Jan 5 05:45:39 CET 2005
I have a cron job that runs the check_nagios plugin and e-mails us the output. Earlier today, we started getting:
"Nagios problem: located 3 processes, status log updated 1565 seconds ago"
Every time it ran, the output was the same except that the time since the last update kept growing. Before this, everything worked fine and the status log had usually been updated no more than 8 seconds before the check. I checked status.log and status.sav and confirmed that neither had been updated. I restarted nagios, but the problem remained. Even though none of the partitions were running out of space, I deleted archived logs and restarted nagios again, with the same result. I did some more troubleshooting without any luck. Long story short, I rebooted the RH9 box it was running on and nagios started running again.
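For anyone who wants to reproduce the "status log updated N seconds ago" figure by hand, here is a minimal sketch in Python. It is not the check_nagios source, only an approximation that assumes the figure is the time since the status log was last modified. The function names and the 300-second threshold in the usage note are made up for illustration.

```python
import os
import time


def status_log_age_seconds(path, now=None):
    """Return how many seconds ago the file at `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def is_stale(path, max_age_seconds, now=None):
    """Return True if the file was not modified within `max_age_seconds`."""
    return status_log_age_seconds(path, now) > max_age_seconds
```

Calling is_stale() on the status.log path with a threshold such as 300 would flag the situation above, where the age kept climbing past 1565 seconds.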
Does anyone have an idea of what could have happened, and what I could check? As far as I know, this is the first time it has ever happened. The only recent change was setting up one new host to monitor, so we edited hosts.cfg, hostgroups.cfg and services.cfg, and nagios restarted without error. I even took out those changes and restarted nagios, but still had the same problem. One thing I noticed is that our service-perfdata.out is 790 MB. If I delete it, will nagios create a new one? I doubt it's the cause, since the file is still that big and nagios is running now, but I don't think I want it to grow that large.
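To make the perfdata question concrete, this is the kind of size-based rotation I have in mind, as a rough Python sketch. It assumes that whatever writes service-perfdata.out reopens the file on each write and so recreates it after a rename, which is exactly the part I am unsure about. The function name and the size limit are hypothetical.

```python
import os
import time


def rotate_if_large(path, max_bytes, now=None):
    """Rename `path` to `path.<timestamp>` if it is larger than `max_bytes`.

    Returns the new file name, or None if nothing was rotated.
    Assumption: the writer recreates `path` on its next write.
    """
    if not os.path.exists(path) or os.path.getsize(path) <= max_bytes:
        return None
    # time.localtime(None) falls back to the current time.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    rotated = f"{path}.{stamp}"
    os.rename(path, rotated)
    return rotated
```

Run from cron against service-perfdata.out with a limit of, say, 100 MB, this would keep the old data under a timestamped name instead of deleting it outright.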
What kind of maintenance should I be performing on nagios? We've had it running for over a year and we haven't really done any kind of cleanup on it.
Your help to this newbie would be greatly appreciated.
- Rim
Rimbert Rivera
Manager, Information Technology
COMTEX News Network
rrivera at comtex.com
(703) 820-2000
Discover more about COMTEX at: http://www.comtex.com/