ios (input output service) chain crashing (Linux)

Added by cjlpa over 6 years ago

Hello,

I have compiled the server for Linux (Debian 6.0) using the guide in the wiki, also with with_static_drivers=on as suggested when experiencing chain crashes.
However, the Input Output Service is chain crashing (each run lasts about 2-3 minutes), producing gigabytes of logs along the way. I didn't find any ERR entries in the log though, only INF and WRN entries.

Does anyone have experience with this behaviour? Also, where can I start looking for the error if there is none in the log file?

Thanks a lot,
F.


Replies (17)

RE: ios (input output service) chain crashing (Linux) - Added by sfb over 6 years ago

I've had the same problem on my shard. I find it odd that my Ring shard is fine but my standard shard crashes.

Can you do me a favor and check your IOS logs and let me know if you see something like this:

2012/05/09 17:03:00 INF 3071096528 213.166.170.12/IOS-130 common.cpp 567 Exception : Exception will be launched: stream does not contain at least 1095321667 bytes for check

It would be the last line before the IOS service starts back up again.

Thanks,
sfb
/s

RE: ios (input output service) chain crashing (Linux) - Added by cjlpa over 6 years ago

$ cat input_output_service.log | grep "Exception"

2012/05/10 18:59:30 INF 1446532896 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "MaxDistSay" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 18:59:30 INF 1446532896 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "ReadWorkOnly" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:01:56 INF 1446532896 127.0.1.1/IOS-128 common.cpp 579 Exception : Exception will be launched: stream does not contain at least 1095321667 bytes for check
2012/05/10 19:02:00 INF 1890117408 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "MaxDistSay" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:02:00 INF 1890117408 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "ReadWorkOnly" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:04:26 INF 1890117408 127.0.1.1/IOS-128 common.cpp 579 Exception : Exception will be launched: stream does not contain at least 1095321667 bytes for check
2012/05/10 19:04:30 INF 3356026656 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "MaxDistSay" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:04:30 INF 3356026656 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "ReadWorkOnly" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:06:57 INF 3356026656 127.0.1.1/IOS-128 common.cpp 579 Exception : Exception will be launched: stream does not contain at least 1095321667 bytes for check
2012/05/10 19:07:01 INF 1894926112 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "MaxDistSay" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:07:01 INF 1894926112 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "ReadWorkOnly" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:09:27 INF 1894926112 127.0.1.1/IOS-128 common.cpp 579 Exception : Exception will be launched: stream does not contain at least 1095321667 bytes for check
2012/05/10 19:09:31 INF 322643744 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "MaxDistSay" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:09:31 INF 322643744 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "ReadWorkOnly" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:11:57 INF 322643744 127.0.1.1/IOS-128 common.cpp 579 Exception : Exception will be launched: stream does not contain at least 1095321667 bytes for check
2012/05/10 19:12:01 INF 640333600 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "MaxDistSay" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:12:01 INF 640333600 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "ReadWorkOnly" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:14:28 INF 640333600 127.0.1.1/IOS-128 common.cpp 579 Exception : Exception will be launched: stream does not contain at least 1095321667 bytes for check
2012/05/10 19:14:31 INF 3001235232 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "MaxDistSay" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"
2012/05/10 19:14:31 INF 3001235232 127.0.1.1/IOS-128 config_file.h 326 EUnknownVar : CF: Exception will be launched: variable "ReadWorkOnly" not found in file "/home/florian/ryzom/ryzom/code/ryzom/server/input_output_service.cfg"

RE: ios (input output service) chain crashing (Linux) - Added by sfb over 6 years ago

Can you stop the shard, delete the logs, start the shard and then send a copy of the input_output_service.log file after it crashes?

Thanks,
sfb
/s

RE: ios (input output service) chain crashing (Linux) - Added by molator over 6 years ago

Is that a 64-bit distro?
If so, try building with WITH_STATIC=ON and WITH_STATIC_DRIVERS=ON.

RE: ios (input output service) chain crashing (Linux) - Added by sfb over 6 years ago

Already done in my case, Molator.

RE: ios (input output service) chain crashing (Linux) - Added by sfb over 6 years ago

One thing that would help would be a backtrace. It's a little tricky. First run ps -auwwwx | grep input_output_service; you should see two processes, one that is service_launcher.sh and one that is the actual IOS service. You will need to kill both, because otherwise the service launcher will start IOS back up again. Then change to the code/ryzom/server folder and run gdb src/input_output_service. Once in gdb, type run -C. -L. --nobreak --writepid to run the service under GDB. When it aborts, type bt to print the backtrace.

My example:

$ gdb src/input_output_service
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying" 
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /home/mattr/sandbox/ryzom/code/ryzom/server/src/input_output_service/input_output_service...
(gdb) run -C. -L. --nobreak --writepid
[...snip...]
INF b79b06d0 common.cpp 567 Exception 213.166.170.12/IOS-130 : Exception will be launched: stream does not contain at least 1095321667 bytes for check
terminate called after throwing an instance of 'NLMISC::EStreamOverflow'
  what():  stream does not contain at least 1095321667 bytes for check

Program received signal SIGABRT, Aborted.
0xb7fdf424 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7fdf424 in __kernel_vsyscall ()
#1  0xb79fdc8f in raise () from /lib/i386-linux-gnu/libc.so.6
#2  0xb7a012b5 in abort () from /lib/i386-linux-gnu/libc.so.6
#3  0xb7c2a4ed in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#4  0xb7c28283 in ?? () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#5  0xb7c282bf in std::terminate() () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#6  0xb7c2840e in __cxa_throw () from /usr/lib/i386-linux-gnu/libstdc++.so.6
#7  0x0822f2c6 in checkStreamSize (numBytes=<optimized out>, this=<optimized out>) at /home/mattr/sandbox/ryzom/code/nel/include/nel/misc/stream.h:948
#8  NLMISC::CMemStream::serial (this=<optimized out>, b=...) at /home/mattr/sandbox/ryzom/code/nel/include/nel/misc/mem_stream.h:879
#9  0x082324b9 in NLMISC::CMemStream::serial (this=0xbfffef28, b=...) at /home/mattr/sandbox/ryzom/code/nel/include/nel/misc/mem_stream.h:875
#10 0x08145109 in cbNpcChat (msgin=..., serviceName=..., serviceId=...)
    at /home/mattr/sandbox/ryzom/code/ryzom/server/src/input_output_service/messages.cpp:1097
#11 0x081d8a93 in CMirror::updateMirrorAndReceiveMessages (this=0x851ec84, msgin=...)
    at /home/mattr/sandbox/ryzom/code/ryzom/common/src/game_share/mirror.cpp:2166
#12 0x081d96d8 in cbUpdateMirrorAndReceiveMessages (msgin=...) at /home/mattr/sandbox/ryzom/code/ryzom/common/src/game_share/mirror.cpp:2070
#13 0x0836dd64 in NLNET::uncbMsgProcessing (msgin=..., from=0x92984e8) at /home/mattr/sandbox/ryzom/code/nel/src/net/unified_network.cpp:396
#14 0x0838a0a5 in NLNET::CCallbackNetBase::processOneMessage (this=0x92cd860) at /home/mattr/sandbox/ryzom/code/nel/src/net/callback_net_base.cpp:216
#15 0x0838ad01 in NLNET::CCallbackNetBase::baseUpdate2 (this=0x92cd860, timeout=31, mintime=0)
    at /home/mattr/sandbox/ryzom/code/nel/src/net/callback_net_base.cpp:411
#16 0x083936c4 in NLNET::CCallbackClient::update2 (this=0x92cd860, timeout=31, mintime=0)
    at /home/mattr/sandbox/ryzom/code/nel/src/net/callback_client.cpp:128
#17 0x0837021e in NLNET::CUnifiedNetwork::update (this=0x8527c88, timeout=99) at /home/mattr/sandbox/ryzom/code/nel/src/net/unified_network.cpp:1078
#18 0x0811fa58 in NLNET::IService::main (this=0x851e608, serviceShortName=0x1f96d0c1 <Address 0x1f96d0c1 out of bounds>,
    serviceLongName=0x9430660 "(\313D\b\210GO\b\001\235\225\b", servicePort=0, configDir=0x851ea6c "", logDir=0xbffff2b4 "\264)R\b\363\003",
    compilationDate=0x1f96cecd <Address 0x1f96cecd out of bounds>) at /home/mattr/sandbox/ryzom/code/nel/src/net/service.cpp:1399
#19 0x08111640 in main (argc=5, argv=0xbffff6b4) at /home/mattr/sandbox/ryzom/code/ryzom/server/src/input_output_service/input_output_service.cpp:969
(gdb)

Thanks,
sfb
/s
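
A note for readers following the backtrace above: frames #7 to #10 show cbNpcChat deserialising a value, checkStreamSize throwing NLMISC::EStreamOverflow, and nothing catching it, so the C++ runtime calls std::terminate and the process aborts. The sketch below reproduces that failure mode in isolation. TinyMemStream and readString are hypothetical names invented for this example, and the length-prefixed string layout is an assumption; this is not NeL's CMemStream.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory stream; NOT NeL's CMemStream.
class TinyMemStream {
public:
    explicit TinyMemStream(std::vector<unsigned char> bytes)
        : _bytes(std::move(bytes)), _pos(0) {}

    // Read a 32-bit length prefix (host byte order), then that many bytes.
    std::string readString() {
        std::uint32_t len = 0;
        checkSize(sizeof(len));
        std::memcpy(&len, _bytes.data() + _pos, sizeof(len));
        _pos += sizeof(len);

        // If the prefix is garbage (for example, text read as a number),
        // this check fails and throws instead of reading out of bounds.
        checkSize(len);
        std::string out(reinterpret_cast<const char*>(_bytes.data() + _pos), len);
        _pos += len;
        return out;
    }

private:
    // Throw if fewer than numBytes remain after the current read position.
    void checkSize(std::size_t numBytes) const {
        if (_bytes.size() - _pos < numBytes) {
            throw std::runtime_error("stream does not contain at least " +
                                     std::to_string(numBytes) + " bytes for check");
        }
    }

    std::vector<unsigned char> _bytes;
    std::size_t _pos;
};
```

If a caller wraps readString in try/catch, one malformed message can be logged and dropped; if nothing catches the exception, the whole process aborts, which matches the SIGABRT in the trace. Whether catching it inside the IOS callback is the right fix is a separate question that this sketch does not answer.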

RE: ios (input output service) chain crashing (Linux) - Added by cjlpa over 6 years ago

It is x64 Linux, and I have set both cmake options to ON as suggested in the wiki.

Attached is my input_output_service.log.

input_output_service.log - Input_Output_Service.log (172.5 kB)

RE: ios (input output service) chain crashing (Linux) - Added by sfb over 6 years ago

cjlpa,

This is interesting:

2012/05/10 19:40:38 WRN 2553005824 127.0.1.1/IOS-132 stdin_monitor_thread.cpp 84 run : fgets failed

Not sure what to make of this...

Thanks,
sfb
/s

RE: ios (input output service) chain crashing (Linux) - Added by cjlpa over 6 years ago

Also, though I don't know if it is related, the shard start produces a lot of warnings:

mkdir: cannot create directory `ras': File exists

---------------------------------------------------------------------------------
Starting service launcher
---------------------------------------------------------------------------------
CMDLINE = src/ryzom_admin_service/ryzom_admin_service --fulladminname=admin_service --shortadminname=AS C. -L. --nobreak --writepid
CTRL_FILE = ras/ras.launch_ctrl
NEXT_CTRL_FILE = ras/ras.deferred_launch_ctrl
STATE_FILE = ras/ras.state
--------------------------------------------------------------------------------

Press ENTER to launch program
cp: target `welcome_service_default.cfg' is not a directory
-----------------------------------------------------------------------
Launching ...

then later:

WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(mission_queues.txt, ./save_shard/mission_queues.txt, 0, txt): already inserted from './data_shard/', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(ai_service.cfg, ./sheet_pack_cfg/ai_service.cfg, 0, cfg): already inserted from './', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(entities_game_service.cfg, ./sheet_pack_cfg/entities_game_service.cfg, 0, cfg): already inserted from './', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(gpm_service.cfg, ./sheet_pack_cfg/gpm_service.cfg, 0, cfg): already inserted from './', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(input_output_service.cfg, ./sheet_pack_cfg/input_output_service.cfg, 0, cfg): already inserted from './', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(mirror_service.cfg, ./sheet_pack_cfg/mirror_service.cfg, 0, cfg): already inserted from './', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(CMakeLists.txt, ./src/CMakeLists.txt, 0, txt): already inserted from './', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(CMakeLists.txt, ./src/admin_modules/CMakeLists.txt, 0, txt): already inserted from './', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(CMakeLists.txt, ./src/ai_data_service/CMakeLists.txt, 0, txt): already inserted from './', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(commands.cpp, ./src/ai_data_service/commands.cpp, 0, cpp): already inserted from './src/ags_test/', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(messages.cpp, ./src/ai_data_service/messages.cpp, 0, cpp): already inserted from './src/ags_test/', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(messages.h, ./src/ai_data_service/messages.h, 0, h): already inserted from './src/ags_test/', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(notes.txt, ./src/ai_data_service/notes.txt, 0, txt): already inserted from './src/ags_test/', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(service_main.cpp, ./src/ai_data_service/service_main.cpp, 0, cpp): already inserted from './src/ags_test/', skip it
WRN fe756720 path.cpp 1463 insertFileInMap 127.0.1.1/AS-0 : PATH: CPath::insertFileInMap(CMakeLists.txt, ./src/ai_service/CMakeLists.txt, 0, txt): already inserted from './', skip it
....

RE: ios (input output service) chain crashing (Linux) - Added by sfb over 6 years ago

The insertFileInMap warnings are OK. Most of the time warnings are OK, unfortunately.

As a complete side note, this is because CPath is indexing your source tree and the CPath lookup is fairly flat. There can only be one file with a specific file name, and it is first come, first served. In this case CMakeLists.txt is a very common file name.

Thanks,
sfb
/s
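
A minimal sketch of the flat, first-come lookup described above, for readers who want to see why the warnings are harmless. FlatFileIndex, insert and lookup are hypothetical names made up for this example; this is not the real CPath code, only the same idea in a few lines.

```cpp
#include <map>
#include <string>

// Hypothetical flat index: bare file name -> first directory that supplied it.
class FlatFileIndex {
public:
    // Returns true if the name was inserted, false if it was already
    // present (the later duplicate is skipped, as in the warnings).
    bool insert(const std::string& fileName, const std::string& directory) {
        return _index.emplace(fileName, directory).second;
    }

    // Returns the directory that owns fileName, or an empty string if unknown.
    std::string lookup(const std::string& fileName) const {
        auto it = _index.find(fileName);
        return it == _index.end() ? std::string() : it->second;
    }

private:
    std::map<std::string, std::string> _index;
};
```

With this structure, inserting CMakeLists.txt from './' and then again from './src/' keeps the './' entry and reports the second insert as skipped, which is the situation each "already inserted from ..., skip it" line describes.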

RE: ios (input output service) chain crashing (Linux) - Added by cjlpa over 6 years ago

the only fgets in stdin_monitor_thread.cpp is:

void CStdinMonitorThread::run () {
    while(!feof(stdin)) {
        // wait for the main thread to deal with the previous command
        while (commandWaiting()) {
            NLMISC::nlSleep(1);
        }

        // get the next command from the command line
        std::string theCommand;
        theCommand.resize(1024,0);
        fgets((char*)theCommand.c_str(),theCommand.size()-1,stdin);
        theCommand.resize(strlen(theCommand.c_str()));
        // push the command to allow reader thread to deal with it
        pushCommand(theCommand);
    }
}

RE: ios (input output service) chain crashing (Linux) - Added by sfb over 6 years ago

cjlpa,

Could you humor me and edit code/ryzom/common/data_leveldesign/primitives/primitives.cfg and comment out urban_newbieland.primitive in the primitive map and then restart your shard?

Let me know if it keeps crashing.

Thanks,
sfb
/s

RE: ios (input output service) chain crashing (Linux) - Added by cjlpa over 6 years ago

This seems to have fixed it - I have a record IOS uptime of 6 minutes and still counting!

Thank you so much for your help! I must say I am really impressed by the helpfulness of the Ryzom forum people - I've hardly ever received such quick and helpful replies!

Regards,
F.

RE: ios (input output service) chain crashing (Linux) - Added by sfb over 6 years ago

cjlpa,

It's not fixed. It has just narrowed down the problem for us. The problem is with Chiang's NPC scripting that makes him say stuff. I'm not positive where it is but I followed an UMM message that was adjusted as an NPC_CHAT callback in the IOS. When it processed it through cbNpcChat it attempted to extract a phraseId and failed.

What I've done is given you an NPC-less workaround until we figure out what's going on.

Thanks!
sfb
/s
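
One observation that may help with the diagnosis above: the byte count in the exception, 1095321667, is 0x41494843 in hexadecimal. On a little-endian x86 machine those four bytes sit in memory as 0x43 0x48 0x49 0x41, which is the ASCII text "CHIA". That is plain arithmetic; reading it as the start of a name such as "Chiang" being parsed where a length or id was expected is only a guess. The helper below (bytesAsText is a hypothetical name for this example) does the conversion so other suspicious values from the logs can be checked the same way.

```cpp
#include <cstdint>
#include <string>

// Split a 32-bit value into its four bytes, least significant first
// (the order they occupy in memory on a little-endian machine),
// and render them as text.
std::string bytesAsText(std::uint32_t value) {
    std::string out;
    for (int i = 0; i < 4; ++i) {
        unsigned char byte =
            static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
        // Keep printable ASCII as-is; show anything else as '.'.
        out += (byte >= 0x20 && byte < 0x7F) ? static_cast<char>(byte) : '.';
    }
    return out;
}
```

Calling bytesAsText(1095321667u) returns "CHIA". A value that decodes to readable text is a common sign that a reader and a writer disagree about the message layout, though it does not say which side is wrong.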

RE: ios (input output service) chain crashing (Linux) - Added by Max_De_Groot about 6 years ago

I have the same problem. The given workaround helps prevent the crash, so I tried replacing urban_newbieland.primitive with the one provided with the pre-built core server, since that server works just fine. Unfortunately, the problem remains even then. That takes me to an off-topic question: if you change the world, do you need to update client files as well? If so, which client files need to be changed?

RE: ios (input output service) chain crashing (Linux) - Added by Max_De_Groot about 6 years ago

At the risk of sounding stupid...

fgets((char*)theCommand.c_str(),theCommand.size()-1,stdin);

Is fgets supposed to read the entire command, including the terminating NUL at the end of the string?
If so, shouldn't the length passed be .size()? (That way it includes the NUL.)

I assume the command to be read fits in the size that was set: theCommand.resize(1024,0);

Been a long time since I did any C++ programming...(over 10 years)
so if I'm wrong about this please correct me...
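
On the fgets question: per the C standard, fgets(buf, n, stream) stores at most n - 1 characters and then appends the terminating NUL itself, so passing .size()-1 is merely conservative (it leaves one byte unused) and passing .size() would also stay in bounds. What the quoted call does not do is check the return value: fgets returns NULL at end of file or on a read error, which is a plausible but unconfirmed origin of the "fgets failed" warning seen earlier. The sketch below shows both points. readLine is a hypothetical helper written for this example, not code from the Ryzom tree, and it uses a std::vector<char> buffer to avoid writing through c_str().

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Read one line with fgets into a buffer of `capacity` bytes (capacity >= 2).
// Returns false when fgets returns NULL (end of file or read error).
bool readLine(std::FILE* in, std::size_t capacity, std::string& line) {
    std::vector<char> buffer(capacity, '\0');
    // fgets stores at most capacity - 1 characters and always appends '\0',
    // so passing the full buffer size is safe.
    if (std::fgets(buffer.data(), static_cast<int>(buffer.size()), in) == nullptr) {
        line.clear();
        return false;
    }
    line.assign(buffer.data());
    return true;
}
```

With a 4-byte buffer and the input "hello\n", the first call yields "hel" and the next call continues with "lo\n"; once the input is exhausted the function returns false instead of pushing an empty command.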

RE: ios (input output service) chain crashing (Linux) - Added by Max_De_Groot almost 6 years ago

Any progress yet on what the problem is?
If Chiang is removed, would that solve the problem? Or is it a problem in the code?
