<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:"\@SimSun";
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Consolas;
panose-1:2 11 6 9 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";
color:black;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
pre
{mso-style-priority:99;
mso-style-link:"HTML Preformatted Char";
margin:0cm;
margin-bottom:.0001pt;
font-size:10.0pt;
font-family:"Courier New";
color:black;}
tt
{mso-style-priority:99;
font-family:"Courier New";}
span.HTMLPreformattedChar
{mso-style-name:"HTML Preformatted Char";
mso-style-priority:99;
mso-style-link:"HTML Preformatted";
font-family:Consolas;
color:black;}
span.EmailStyle20
{mso-style-type:personal;
font-family:"Calibri","sans-serif";
color:#1F497D;}
span.EmailStyle21
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page Section1
{size:612.0pt 792.0pt;
margin:70.85pt 2.0cm 2.0cm 2.0cm;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body bgcolor=white lang=IT link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>Hello<o:p></o:p></span></p>
<p class=MsoNormal style='margin-left:35.4pt'><tt><span lang=EN-US
style='font-size:10.0pt'>Starting ndo2db first and then Nagios makes
sure that the DB holds current data. </span></tt><tt><span
style='font-size:10.0pt'>If there is outdated data in the DB, it needs to
be removed before Nagios sends any new data. So trimming those
table entries at the beginning is intentional (the so-called pre-launch state,
where the if condition matches). If ndo2db fails for some reason, that data
remains in the database and is removed during the next start. <o:p></o:p></span></tt></p>
<p class=MsoNormal><tt><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'> </span></tt><tt><span
lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>OK, that makes sense when the parent ndo2db process starts, but not when the
child starts.<o:p></o:p></span></tt></p>
<p class=MsoNormal style='margin-left:35.4pt'><span lang=EN-US
style='font-size:10.0pt;font-family:"Courier New"'><br>
<tt>Depending on your startup routine, I would guess that you started Nagios
first and then ndo2db. </tt></span><tt><span style='font-size:10.0pt'>But that
shouldn't matter, because ndomod, as an event broker, keeps data not yet written to ndo2db in
a defined cache. Depending on your configuration, this cache may be too small, so
the oldest entry could be lost (in this case the programstatus of Nagios). But
that's really a guess; you'll have to give more information on where and when this
error occurs, mentioning all the circumstances you can find in the logs (turn on
the most detailed level and everything in debug_level if needed).</span></tt> <span
style='color:#1F497D'><o:p></o:p></span></p>
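<p class=MsoNormal style='margin-left:35.4pt'><tt><span lang=EN-US
style='font-size:10.0pt'>To illustrate the guess above, here is a minimal sketch of how a bounded cache loses its oldest entry once it is full. This is not ndomod code: the BoundedCache class, its capacity and the item names are hypothetical and only model the behaviour described.</span></tt></p>
<pre>
```python
from collections import deque


class BoundedCache:
    """Hypothetical stand-in for a fixed-size output buffer (not the ndomod implementation)."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest item on overflow.
        self.items = deque(maxlen=capacity)
        self.dropped = 0

    def push(self, item):
        if len(self.items) == self.items.maxlen:
            self.dropped += 1  # the oldest entry is about to be lost
        self.items.append(item)


cache = BoundedCache(capacity=3)
for item in ["programstatus", "hoststatus", "servicestatus", "servicecheck"]:
    cache.push(item)

print(list(cache.items))  # programstatus, the oldest entry, is gone
print(cache.dropped)
```
</pre>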
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'> </span><span
lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>ndo2db starts first. When ndomod can’t write to
ndo2db it logs the condition, and that is not happening here. All logs are at
maximum detail.<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal style='margin-left:35.4pt'><tt><span lang=EN-US
style='font-size:10.0pt'>Which code and which configuration?</span></tt><span
lang=EN-US><br>
</span><tt><span style='font-size:10.0pt'>The only thing I can see here is
tstamp.tv_sec, which is a converted timestamp received from the event broker module. It
is a kind of now(), or rather a recent now() from Nagios itself. You may check
ndo2db.c:ndo2db_convert_standard_data_elements.</span></tt><br>
<tt><span style='font-size:10.0pt'>The other value in the comparison is
dbinfo.latest_realtime_data_time, which is initialized in db.c:ndo2db_db_init
and then updated if dbinfo.latest_program_status_time is newer (db.c:374; set directly
to that value). Several other realtime data values may also update
it.</span></tt><br>
<tt><span style='font-size:10.0pt'>So the point of this comparison is: if the current
Nagios NDO_DATA_TIMESTAMP is newer than the latest realtime data received some
time before, it is time for a cleanup at the very beginning of ndo2db (check
the sequence in ido2db.c:main).</span></tt> <o:p></o:p></p>
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'> </span><span
lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>The code and configuration of visualization software such as NagVis.<o:p></o:p></span></p>
<p class=MsoNormal style='margin-left:35.4pt'><span lang=EN-US
style='font-size:10.0pt;font-family:"Courier New"'><br>
</span><tt><span lang=EN-US style='font-size:10.0pt;color:#1F497D'>T</span></tt><tt><span
lang=EN-US style='font-size:10.0pt'>he conditional statement requires not only
that the difference is 0 or more, but also that the process is in the pre-launch state (see
above). </span></tt><tt><span style='font-size:10.0pt'>But a side question:
how did you get these values? Current NDOUtils code doesn't give any debug
information at this stage.</span></tt> <o:p></o:p></p>
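<p class=MsoNormal style='margin-left:35.4pt'><tt><span lang=EN-US
style='font-size:10.0pt'>The condition described above can be sketched roughly as follows. This is only a paraphrase of the explanation in this thread, not the NDOUtils source; the function and parameter names are hypothetical, chosen to mirror tstamp.tv_sec and dbinfo.latest_realtime_data_time, and the timestamps are made-up examples.</span></tt></p>
<pre>
```python
def should_clean_up(tstamp_sec, latest_realtime_data_time, pre_launch):
    """Paraphrase of the condition described in this thread (not copied from NDOUtils)."""
    # Difference of 0 or more: the broker timestamp is not older than the
    # latest realtime data already recorded in the database.
    is_not_older = (tstamp_sec - latest_realtime_data_time) >= 0
    return pre_launch and is_not_older


# Hypothetical values: pre-launch state and a newer broker timestamp, so cleanup runs.
print(should_clean_up(1253276162, 1253276100, pre_launch=True))
# Same timestamps once the process is past pre-launch: no cleanup.
print(should_clean_up(1253276162, 1253276100, pre_launch=False))
# Broker timestamp older than the stored realtime data: no cleanup.
print(should_clean_up(1253276000, 1253276100, pre_launch=True))
```
</pre>
<p class=MsoNormal style='margin-left:35.4pt'><tt><span lang=EN-US
style='font-size:10.0pt'>Under this reading, every process that enters the pre-launch state with a current timestamp triggers the cleanup again.</span></tt></p>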
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'> </span><span
lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>I am referring to the code in the visualization software, not NDO. <o:p></o:p></span></p>
<p class=MsoNormal style='margin-left:35.4pt'><tt><span lang=EN-US
style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'> </span></tt><tt><span
lang=EN-US style='font-size:10.0pt'> </span></tt><span lang=EN-US
style='font-size:10.0pt;font-family:"Courier New"'><br>
<tt>Seeing your ndo2db die and refork explains why the pre_launch_state and
timestamp condition matches, and so why a database cleanup is performed
at each interval.</tt></span><span lang=EN-US><br>
</span><tt><span style='font-size:10.0pt'>It would be interesting to know why ndo2db is
dying. This may vary with your configuration: are you using a TCP or a Unix socket, for example?
Do you have more detailed debug logs, or are there messages like "error
writing to datasink" in the logs?</span></tt><tt><span style='font-size:
10.0pt;color:#1F497D'><o:p></o:p></span></tt></p>
<p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'> </span><span
lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>The key point here is that ndo2db doesn’t seem to have
died; instead it exits gracefully, as you can see from the strace output below.<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'> The
other strange thing is that ndo2db reforks exactly every 60 seconds. Where in
the code can I find this interval? Why does the child ndo2db exit?<o:p></o:p></span></p>
<p class=MsoNormal style='text-indent:35.4pt'><span lang=EN-US
style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'> [root@itsss214
var]# ps -ef |grep ndo2<o:p></o:p></span></p>
<p class=MsoNormal style='text-indent:35.4pt'><span lang=EN-US
style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>nagios
24831 1 0 09:53
? 00:00:00 /opt/nagios/bin/ndo2db -c
/opt/nagios/etc/ndo2db.cfg<o:p></o:p></span></p>
<p class=MsoNormal style='text-indent:35.4pt'><span lang=EN-US
style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>nagios
24957 24831 1 14:51 ? 00:00:00
/opt/nagios/bin/ndo2db -c /opt/nagios/etc/ndo2db.cfg<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'> Note
the child start time, 14:51; the parent started at 09:53.<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'><o:p> </o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>time(NULL)
= 1253276162<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>gettimeofday({1253276162, 271627}, NULL) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>write(3, "[1253276162.271627] [002.0] [pid"..., 272) =
272<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>lseek(3, 0,
SEEK_CUR)
= 53846244<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>poll([{fd=7, events=POLLIN|POLLPRI}], 1, 0) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>write(7, "\350\0\0\0\3INSERT INTO nagios_processe"...,
236) = 236<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>read(7, "\t\0\0\1\0\1\374\371\341\2\0\0\0", 16384) =
13<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>time([1253276162])
= 1253276162<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>time(NULL)
= 1253276162<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>gettimeofday({1253276162, 272624}, NULL) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>write(3, "[1253276162.272624] [002.0] [pid"..., 272) =
272<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>lseek(3, 0,
SEEK_CUR)
= 53846516<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>poll([{fd=7, events=POLLIN|POLLPRI}], 1, 0) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>write(7, "\350\0\0\0\3INSERT INTO nagios_processe"...,
236) = 236<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>read(7, "\t\0\0\1\0\1\374\372\341\2\0\0\0", 16384) =
13<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>gettimeofday({1253276162, 272626}, NULL) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>write(3, "[1253276162.272626] [002.0] [pid"..., 163) =
163<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>lseek(3, 0,
SEEK_CUR)
= 53846679<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>poll([{fd=7, events=POLLIN|POLLPRI}], 1, 0) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>write(7, "{\0\0\0\3UPDATE nagios_programstatus"...,
127) = 127<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>read(7, "0\0\0\1\0\1\0\2\0\0\0(Rows matched: 1
Cha"..., 16384) = 52<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>time([1253276162])
= 1253276162<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>time(NULL)
= 1253276162<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>time([1253276162])
= 1253276162<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>read(6, "\n219:\n1=1603\n2=0\n3=0\n4=125327616"..., 511)
= 46<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>time(NULL)
= 1253276162<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>time([1253276162])
= 1253276162<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>read(6, "\n1000\nENDTIME: 1253276162\nGOODBY"..., 511)
= 35<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>read(6, "",
511)
= 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>gettimeofday({1253276162, 312611}, NULL) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>write(3, "[1253276162.312611] [002.0] [pid"..., 258) =
258<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>lseek(3, 0,
SEEK_CUR)
= 53846937<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>poll([{fd=7, events=POLLIN|POLLPRI}], 1, 0) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>write(7, "\332\0\0\0\3UPDATE nagios_conninfo SET "...,
222) = 222<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>read(7, "0\0\0\1\0\1\0\2\0\0\0(Rows matched: 1
Cha"..., 16384) = 52<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>write(7, "\1\0\0\0\1",
5)
= 5<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>shutdown(7, 2 /* send and receive */) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>close(7)
= 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>time([1253276162])
= 1253276162<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>stat("/etc/localtime", {st_mode=S_IFREG|0644,
st_size=951, ...}) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>stat("/etc/localtime", {st_mode=S_IFREG|0644,
st_size=951, ...}) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>stat("/etc/localtime", {st_mode=S_IFREG|0644,
st_size=951, ...}) = 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>sendto(8, "&lt;14&gt;Sep 18 14:16:02 ndo2db: Succ"...,
73, MSG_NOSIGNAL, NULL, 0) = 73<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>close(6)
= 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>close(3)
= 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>munmap(0x2a95557000,
4096)
= 0<o:p></o:p></span></p>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>exit_group(0)
= ?<o:p></o:p></span></p>
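<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>As a cross-check, the epoch value returned by time(NULL) in the trace can be converted to wall-clock time and compared with the syslog line sent just before exit_group(0). The sketch below assumes the host runs on Central European Summer Time (UTC+2); that offset is my assumption and is not stated anywhere in the trace.<o:p></o:p></span></p>
<pre>
```python
from datetime import datetime, timedelta, timezone

EPOCH_FROM_STRACE = 1253276162  # value returned by time(NULL) in the trace

utc_time = datetime.fromtimestamp(EPOCH_FROM_STRACE, tz=timezone.utc)
# Assumption: the host clock is on Central European Summer Time (UTC+2).
local_time = utc_time.astimezone(timezone(timedelta(hours=2)))

print(utc_time.strftime("%Y-%m-%d %H:%M:%S"))  # 2009-09-18 12:16:02
print(local_time.strftime("%H:%M:%S"))         # 14:16:02
```
</pre>
<p class=MsoNormal><span lang=EN-US style='font-size:11.0pt;font-family:"Calibri","sans-serif";
color:#1F497D'>With that offset the result matches the "Sep 18 14:16:02" syslog message, so the whole sequence up to exit_group(0) happens within the same second.<o:p></o:p></span></p>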
</div>
</body>
</html>