Issues with 1.0b5
Brian Wilson
wilson at ncsu.edu
Thu Aug 22 20:15:31 CEST 2002
Yes, that's what I'm currently doing with netsaint: inserting and
removing commands from the external commands file. I just thought that
this might have been on someone's wishlist for a while and was hoping to
see it in nagios.
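
For illustration, a script of the kind discussed here could look like the sketch below. It is not the script mentioned above, only a minimal example that builds one SCHEDULE_HOST_DOWNTIME line per host and appends the lines to the external command file. The field order (host, start, end, fixed flag, duration, author, comment) is an assumption based on the 1.x command format, and later releases add further fields, so check the external command documentation for your version. The function names, the host names, and the command file path are hypothetical.

```python
import time


def downtime_commands(hosts, start, end, author, comment, now=None):
    """Build one fixed-downtime command line per host.

    start and end are Unix timestamps. The field layout is an assumed
    1.x-style format; verify it against your release before use.
    """
    now = int(time.time()) if now is None else int(now)
    start, end = int(start), int(end)
    duration = end - start
    if duration <= 0:
        raise ValueError("end must be later than start")
    lines = []
    for host in hosts:
        fields = [
            "SCHEDULE_HOST_DOWNTIME",
            host,
            str(start),
            str(end),
            "1",            # fixed downtime (assumed flag value)
            str(duration),  # length in seconds
            author,
            comment,
        ]
        lines.append("[%d] %s" % (now, ";".join(fields)))
    return lines


def submit(lines, command_file):
    """Append the command lines to the external command file."""
    with open(command_file, "a") as fh:
        for line in lines:
            fh.write(line + "\n")


if __name__ == "__main__":
    # Hypothetical building outage lasting six hours, starting in one hour.
    begin = int(time.time()) + 3600
    for line in downtime_commands(
        ["switch1", "switch2", "switch3"],
        begin,
        begin + 6 * 3600,
        "wilson",
        "building power outage",
    ):
        print(line)
    # To submit for real, pass the lines to submit() with the path of your
    # command file, e.g. submit(lines, "/path/to/nagios.cmd") (path assumed).
```

Keeping the line construction separate from the file write makes it easy to print and inspect the commands before anything is sent to the daemon.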
On Thu, 22 Aug 2002, Rusch, Daniel wrote:
> Brian,
>
> As to "bulk managing" groups of hosts, I understand your point that it would
> be nice if this were doable out of the box. Have you checked out the
> external commands? You can write a simple script to do what you want.
>
> -----Original Message-----
> From: Brian Wilson [mailto:wilson at unity.ncsu.edu]
> Sent: Thursday, August 22, 2002 9:22 AM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Issues with 1.0b5
>
>
>
> First off, I've been a longtime user of netsaint and my current network
> status system is based off of a combination of netsaint and hp openview.
> After learning about the nagios/netsaint spinoff, I decided to look into
> it because of some critical features that were missing from netsaint:
>
> First being the fact that nagios would retain host downtime on program
> restarts. This is critical and I got around this in netsaint with a
> series of AT jobs. I also like the config file changes.. I have scripts
> that do all configuration file generation, and having this change was well
> needed ( There is, however, a limitation as to how many devices you can
> add to a group.. try adding 2,000 hosts to a group and see what happens..
> group members should really be 1 per line instead of comma separated
> member switch1
> member switch2
> member switch3
> etc..
> ).
>
> I've been convinced to do a re-write of our current system and decided to
> try nagios instead of netsaint, mainly because of downtime retention. To
> test downtime retention, I set up a test device with a dummy host-alive
> check and 2 service checks, one being a ping:
>
> define service{
>         use                     generic-service;
>         host_name               cisco-temp;
>         is_volatile             0;
>         check_period            24x7;
>         contact_groups          dummy-group;
>         notification_period     24x7;
>         notification_interval   180;
>         notification_options    c,r;
>         service_description     PING;
>         max_check_attempts      5;
>         normal_check_interval   8;
>         retry_check_interval    2;
>         event_handler           switch-down-event-handler;
>         check_command           check_ping;
>         }
>
> I set up a dummy-group to send email to (because I don't want
> notifications of a host/service down being sent directly to an email
> address.. imagine if 100 hosts suddenly went down.. 100 emails).
> Instead, I always call an event handler to perform some action (i.e.,
> allow host notifications to queue up and send them in bulk every 15
> minutes or so).
>
> 1st problem: I then add a host downtime entry for this device, unhook it
> from the network so the ping service check will fail, and I continue to
> get notifications from the event handler for that service. (so, am I
> correct in assuming that just because you set a host downtime entry for a
> device that the service checks will keep sending notifications? Why is
> this? If a host is down, then services will be down)
>
> 2nd problem: I then add a service downtime entry for this device, unhook
> it from the network again so that the ping service check will fail, and I
> still continue to get notifications from the event handler for that
> service. (so, am I correct in assuming that downtime for a service only
> affects email notifications and not event handler notifications?)
>
> 3rd problem: One problem with netsaint, which I still see in nagios, is
> the lack of a tool to bulk manage a number of devices (i.e., if I know a
> building is going to lose power from 08:00 to 14:00, then I want to
> schedule downtime for all devices in that building). With the current
> process of setting downtime, this is rather tedious. I'll probably get
> around it by writing my own interface to do this (as I did with netsaint),
> but this would be a great feature to add to your wishlist.
>
> Question: assuming downtime scheduling works correctly, would it be
> possible to put downtime data into mysql and point 2 different nagios
> servers at the database? (The servers would be monitoring the same
> devices, so, essentially, they would have the same downtime schedules.)
>
> Thanks for listening.. I think nagios is a step in the right direction as
> far as network/host monitoring goes.
>
> Brian
>
> --
> Brian Wilson <wilson at ncsu.edu> Network Analyst
> Communication Technologies, ATD W: 919.513.3472
> North Carolina State University www.ncstate.net
>
--
Brian Wilson <wilson at ncsu.edu> Network Analyst
Communication Technologies, ATD W: 919.513.3472
North Carolina State University www.ncstate.net