Nagios 2.0a1 SIGSEGV problem solved - bug still present?
Tom DE BLENDE (GCC)
Tom.DeBlende at dhl.com
Mon Sep 6 12:13:52 CEST 2004
Dear all,
Thanks to the help I received from Mr. Hopcroft, I managed to solve the
SIGSEGV problems I had with Nagios 2.0a1. Here is the output I had from gdb:
[root at gcclo77 etc]# gdb ../bin/nagios
GNU gdb Red Hat Linux (6.0post-0.20040223.20rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host
libthread_db library "/lib/tls/libthread_db.so.1".
(gdb) set args /usr/local/nagios/etc/nagios.cfg
(gdb) r
Starting program: /usr/local/nagios/bin/nagios
/usr/local/nagios/etc/nagios.cfg
[Thread debugging using libthread_db enabled]
[New Thread -1220216704 (LWP 22830)]
Nagios 2.0a1
Copyright (c) 1999-2004 Ethan Galstad (nagios at nagios.org)
Last Modified: 11-18-2003
License: GPL
Nagios 2.0a1 starting... (PID=22830)
[New Thread -1220326480 (LWP 22839)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1220216704 (LWP 22830)]
0xb747ea6a in __strtol_internal () from /lib/tls/libc.so.6
(gdb) list
231 {"version",no_argument,0,'V'},
232 {"license",no_argument,0,'V'},
233 {"verify",no_argument,0,'v'},
234 {"daemon",no_argument,0,'d'},
235 {0,0,0,0}
236 };
237 #endif
238
239 /* make sure we have the correct number of command line
arguments */
240 if(argc<2)
(gdb) info stack
#0 0xb747ea6a in __strtol_internal () from /lib/tls/libc.so.6
#1 0x0806c49a in xrddefault_read_state_information
(main_config_file=0x809b008 "/usr/local/nagios/etc/nagios.cfg") at
stdlib.h:317
#2 0x08069d37 in read_initial_state_information
(main_config_file=0x809b008 "/usr/local/nagios/etc/nagios.cfg") at
sretention.c:99
#3 0x08052413 in main (argc=134852616, argv=0x809b008) at nagios.c:614
(gdb) bt full
#0 0xb747ea6a in __strtol_internal () from /lib/tls/libc.so.6
No symbol table info available.
#1 0x0806c49a in xrddefault_read_state_information
(main_config_file=0x809b008 "/usr/local/nagios/etc/nagios.cfg") at
stdlib.h:317
temp_buffer =
"state_history\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\000\000\0000\000\000\000.170000
ms)\000\000\000he published applications \"Microsoft Word 2000,Microsoft
Excel 2000,Microsoft PowerPoint 2000,Microsoft Access 2000,Desktop,Inte"...
temp_buffer2 = "\b°\t\b", '\0' <repeats 16 times>,
"\001\000\000\000[F_·>º^·\027AF·_sta_ at F·P\000\000\000Lº^·t·^·\004³^·ì\001`·\t\000\000\000¨\237]·\020\005`·|\233]·þ)F·øÄÿ¿æ5_·\000\000\000\000[F_·>º^·,\017R·\000\000\000\000\000\000\000\000\230xX·\230xX·\000\000\000\000¤Êÿ¿«sI·\230Æÿ¿\000\000\000\000\001",
'\0' <repeats 23 times>,
"\220Êÿ¿ÜI·XÆÿ¿´\201\000\000\001\000\000\000\000\000\000\000HÅÿ¿\002\000\000\000\001\000\000\000ÌÊÿ¿xÉÿ¿\000\000\000\000m\t\t"...
temp_ptr = 0x0
fp = (FILE *) 0x80a4358
host_name = 0x809b290 "frangocopy"
service_description = 0x809c538 "Backup"
data_type = 4
x = 19
temp_host = (host *) 0x0
temp_service = (service *) 0x82da188
temp_command = (command *) 0xbfffc800
var = 0xbfffc800 "state_history"
val = 0xbfffc80e "0"
current_time = 1094462915
scheduling_info_is_ok = 0
#2 0x08069d37 in read_initial_state_information
(main_config_file=0x809b008 "/usr/local/nagios/etc/nagios.cfg") at
sretention.c:99
No locals.
#3 0x08052413 in main (argc=134852616, argv=0x809b008) at nagios.c:614
result = 134852616
error = 0
buffer = "Nagios 2.0a1 starting...
(PID=22830)\000\000\000\000Ñ\233_·`øD·\000\000\000\000
\000\000\000\002\000\000\000ðõD·ÔüD·u._·\000
]·MÐ\000\000\020\005`·|Îÿ¿\017Ì^·ì\001`·\224\b`·\000\000\000\000\000\000\000\000)Ú^·\000
^·\021", '\0' <repeats 11 times>, "p\r`·", '\0' <repeats 32 times>,
"àÍÿ¿\000\000\000\000\000\000\000\000OÚ_·<Ù_·>Ù_·ø\006`·\000\000\000\000ì\001`·¦\022Q\001=\v\003",
'\0' <repeats 25 times>, "ø\006`·", '\0' <repeats 44 times>...
display_license = 0
display_help = 0
c = -1219031296
option_index = 0
long_options = {{name = 0x8087a00 "help", has_arg = 0, flag =
0x0, val = 104}, {name = 0x8088a90 "version", has_arg = 0,
flag = 0x0, val = 86}, {name = 0x8087a05 "license", has_arg = 0, flag
= 0x0, val = 86}, {name = 0x8087a0d "verify",
has_arg = 0, flag = 0x0, val = 118}, {name = 0x8087a14 "daemon",
has_arg = 0, flag = 0x0, val = 100}, {name = 0x0,
has_arg = 0, flag = 0x0, val = 0}}
(gdb)
This led Mr. Hopcroft to believe that there might be a problem
with the entry for the Backup service on frangocopy. I looked
in the retention.dat file and noticed that the entry for that
service wasn't properly formatted: the trailing } was missing. I
tried adding it manually, but that didn't work. Only after removing the
entry entirely from the file could I start Nagios again.
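For anyone who wants to check a retention file for the same kind of damage before starting Nagios, here is a minimal sketch in C. It is not taken from the Nagios source. It assumes that each entry opens with a line such as "service {" and closes with a line containing only "}"; the post above only confirms the closing brace, so the opening-line format and the function name find_unclosed_block are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Nagios.
 * Scans retention-style text and returns 0 if every block that opens with a
 * line ending in '{' is closed by a line consisting only of '}'.
 * Otherwise returns the 1-based line number where the unclosed block opened.
 * Lines containing '=' are treated as "name=value" data and never open a block. */
int find_unclosed_block(const char *text)
{
    int lineno = 0;
    int open_at = 0;            /* line where the current block opened, 0 if none */
    const char *p = text;

    assert(text != NULL);

    while (*p != '\0') {
        const char *nl = strchr(p, '\n');
        const char *next = nl ? nl + 1 : p + strlen(p);
        const char *s = p;
        const char *end = nl ? nl : next;
        size_t len;

        lineno++;

        /* trim leading and trailing blanks (and a trailing '\r') */
        while (s < end && (*s == ' ' || *s == '\t'))
            s++;
        while (end > s && (end[-1] == ' ' || end[-1] == '\t' || end[-1] == '\r'))
            end--;
        len = (size_t)(end - s);

        if (len == 1 && s[0] == '}') {
            open_at = 0;                        /* block closed properly */
        } else if (len > 0 && end[-1] == '{' && memchr(s, '=', len) == NULL) {
            if (open_at != 0)
                return open_at;                 /* new block before old one closed */
            open_at = lineno;
        }
        p = next;
    }
    return open_at;                             /* non-zero if input ended inside a block */
}
```

Calling find_unclosed_block() on the contents of retention.dat would return the line at which the damaged entry begins, under the format assumptions stated above, which makes it easier to find the entry to repair or remove.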
I don't know how the retention file got corrupted like that, but a
malformed entry probably shouldn't cause a segfault, which is why I'm
posting this. People more skilled than me can now look into it and
change the code where appropriate.
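As an illustration of the kind of guard that could avoid such a crash: the backtrace shows temp_ptr = 0x0 while var is "state_history" and the fault is inside strtol, which is consistent with a NULL token being handed to a number conversion. That reading is an assumption and has not been checked against the actual Nagios source. The sketch below, in C, shows one defensive way to parse a comma-separated list of integers; parse_int_field and parse_history are hypothetical names, not Nagios functions.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Nagios.
 * Converts tok to an int. Returns 1 and stores the value in *out on success.
 * Returns 0 and leaves *out untouched if tok is NULL, empty, not a number,
 * followed by trailing characters, or out of range for int. */
int parse_int_field(const char *tok, int *out)
{
    char *endp = NULL;
    long v;

    assert(out != NULL);
    if (tok == NULL || *tok == '\0')
        return 0;                       /* never pass NULL to strtol */

    errno = 0;
    v = strtol(tok, &endp, 10);
    if (errno != 0 || endp == tok || *endp != '\0' || v < INT_MIN || v > INT_MAX)
        return 0;

    *out = (int)v;
    return 1;
}

/* Hypothetical helper, not part of Nagios.
 * Parses up to max comma-separated integers from the writable buffer val
 * into hist and returns how many were stored. Stops at the first missing
 * or malformed field instead of dereferencing a NULL token. */
size_t parse_history(char *val, int *hist, size_t max)
{
    size_t n = 0;
    char *tok;

    if (val == NULL || hist == NULL)
        return 0;

    tok = strtok(val, ",");
    while (tok != NULL && n < max) {
        int v;
        if (!parse_int_field(tok, &v))
            break;                      /* malformed field: stop, do not crash */
        hist[n++] = v;
        tok = strtok(NULL, ",");
    }
    return n;
}
```

With this approach a truncated line yields fewer values than expected, and the caller can log a warning and skip the entry rather than crash.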
Kind regards,
Tom
--
Tom De Blende
Senior Infrastructure Analyst
DHL European Coordination Center - IT Department
Tel +32 2 713 42 62
Fax +32 2 713 52 00