Description
Crash report
What happened?
I started using Py_NewInterpreterFromConfig to add a per-thread interpreter to my library.
The library is thread-safe and supports many languages:
- C C++ JAVA C# VB.Net Perl Python GO TCL Ruby Php
The other languages with thread support
- C C++ JAVA C# TCL GO
are working fine. Python also works fine without thread support.
Task: I have now started to use thread support in Python.
I had to change my code to multi-phase initialization, isolate the Python data at a per-thread level, and finally got a simple thread example working -> ok
I am testing a client/server application. The server creates a thread in my library, and this thread is initialized with
if (create == MQ_FACTORY_NEW_THREAD) {
    //mk_debug_color(MK_YELLOW,"%s","MQ_FACTORY_NEW_THREAD");
    static PyInterpreterConfig config = {
        .use_main_obmalloc = 0,
        .allow_fork = 0,
        .allow_exec = 0,
        .allow_threads = 1,
        .allow_daemon_threads = 0,
        .check_multi_interp_extensions = 1,
        .gil = PyInterpreterConfig_OWN_GIL,
    };
    PyThreadState *tstate = NULL;
    PyStatus status = Py_NewInterpreterFromConfig(&tstate, &config);
    MK_RT_REF.threadData = tstate;
    //printV("#2 PyThreadState_Get<%p>", PyThreadState_Get())
    if (PyStatus_Exception(status)) {
        return MkErrorSetC_3_NULL(status.err_msg, status.func, 300+status.exitcode);
    }
    // mark new interpreter as "MQ_STARTUP_IS_THREAD"
    PyObject *mainM = PyImport_AddModule("__main__");
    PyObject *strO = PyUnicode_FromString("MQ_STARTUP_IS_THREAD");
    PyObject_SetAttrString(mainM,"__name__", strO);
    Py_CLEAR(strO);
    // source initial script
    FILE *FH = fopen(MqInitGetArg0()->data[1]->storage.first.C, "r");
    check_LNG(PyRun_SimpleFileEx(FH,MqInitGetArg0()->data[1]->storage.first.C,true)) {
        return MkErrorSetV_1E("script '%s' failed",MqInitGetArg0()->data[1]->storage.first.C);
    }
}
I found that, always after 4 successful simple round trips (client → server → client), the server crashes with the message below.
- The stack trace comes from my MqDisasterSignal handler.
exec[#6] -> 'NHI1_EXT/debug/bin/python3' 'NHI1_HOME/example/py/MyServer.py' '--thread' '--uds' '--file' '/tmp/test.uds'
Debug memory block at address p=0x7f0012d1a730: API '^@'
18302063728033398269 bytes originally requested
The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfd):
at p-7: 0x00 *** OUCH
at p-6: 0x00 *** OUCH
at p-5: 0x00 *** OUCH
at p-4: 0x00 *** OUCH
at p-3: 0x00 *** OUCH
at p-2: 0x00 *** OUCH
at p-1: 0x00 *** OUCH
Because memory is corrupted at the start, the count of bytes requested
may be bogus, and checking the trailing pad bytes may segfault.
The 8 pad bytes at tail=0xfdfe7cfe10cfa52d are
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: BackTrace {
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ library : filename : lineno ] function
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ ------- : -------- : ------ ] --------
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ theLink : c/sys_mq.c : 705 ] MqDisasterSignal
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ unknown : unknown : 0 ] unknown
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/obmalloc.c : 2408 ] _PyObject_DebugDumpAddress
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/obmalloc.c : 2326 ] _PyMem_DebugCheckAddress
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/obmalloc.c : 2159 ] _PyMem_DebugRawFree
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/obmalloc.c : 685 ] PyMem_RawFree
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/obmalloc.c : 1853 ] _PyObject_Free
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/obmalloc.c : 2163 ] _PyMem_DebugRawFree
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/obmalloc.c : 2296 ] _PyMem_DebugFree
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/obmalloc.c : 830 ] PyObject_Free
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/dictobject.c : 1569 ] dictresize
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/dictobject.c : 1194 ] insertion_resize
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/dictobject.c : 1261 ] insertdict
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/dictobject.c : 1865 ] _PyDict_SetItem_Take2
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/dictobject.c : 1883 ] PyDict_SetItem
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/typeobject.c : 7618 ] add_subclass
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/typeobject.c : 7352 ] type_ready_add_subclasses
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/typeobject.c : 7515 ] type_ready
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/typeobject.c : 7553 ] PyType_Ready
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/typeobject.c : 3795 ] type_new_impl
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/typeobject.c : 3929 ] type_new
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/typeobject.c : 1664 ] type_call
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/call.c : 240 ] _PyObject_MakeTpCall
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/typeobject.c : 3949 ] type_vectorcall
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/call.c : 133 ] _PyObject_FastCallDictTstate
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/call.c : 157 ] PyObject_VectorcallDict
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Python/bltinmodule.c : 208 ] builtin___build_class__
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/methodobject.c : 438 ] cfunction_vectorcall_FASTCALL_KEYWORDS
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Include/internal/pycore_call.h : 92 ] _PyObject_VectorcallTstate
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Objects/call.c : 325 ] PyObject_Vectorcall
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Python/bytecodes.c : 2714 ] _PyEval_EvalFrameDefault
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : nclude/internal/pycore_ceval.h : 89 ] _PyEval_EvalFrame
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Python/ceval.c : 1683 ] _PyEval_Vector
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Python/ceval.c : 578 ] PyEval_EvalCode
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Python/pythonrun.c : 1722 ] run_eval_code_obj
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Python/pythonrun.c : 1743 ] run_mod
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Python/pythonrun.c : 1643 ] pyrun_file
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Python/pythonrun.c : 433 ] _PyRun_SimpleFileObject
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ system : Python/pythonrun.c : 466 ] PyRun_SimpleFileExFlags
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ theLink : py/MqFactoryC_py.c : 195 ] py_mqmsgque_sFactoryCTor
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ theLink : c/MqFactoryS_mq.c : 289 ] MqFactoryInvoke_RT
X> {PRINT :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8 ):MqDisasterSignal }: [ theLink : c/sys_mq.c : 269 ] MqSysServerThreadMain
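A side note on the numbers in the debug report above, as a small Python sketch. It only re-reads values printed in the report. The header layout used here (an 8-byte big-endian size, one API id byte, then 7 pad bytes in front of p on a 64-bit build) is my reading of the comments in Objects/obmalloc.c and should be treated as an assumption, not as a statement about what actually went wrong.

```python
# Values copied from the debug report above.
FORBIDDENBYTE = 0xFD
bytes_requested = 18302063728033398269  # "bytes originally requested"
api_id = 0x00                           # printed as API '^@'
pad_before_p = bytes(7)                 # p-7 .. p-1 were all reported as 0x00

# Assumed layout of the 16 bytes in front of p (64-bit build):
# 8-byte big-endian size, 1 API id byte, 7 FORBIDDENBYTE pad bytes.
size_field = bytes_requested.to_bytes(8, "big")
header_seen = size_field + bytes([api_id]) + pad_before_p

# Show the reconstructed header as hex bytes.
print(header_seen.hex(" "))

# Check whether the "size" is really just eight FORBIDDENBYTE bytes.
size_is_pad_pattern = size_field == bytes([FORBIDDENBYTE]) * 8
print(size_is_pad_pattern)
```

If this reading is right, the bytes in front of p do not look like a header written by the debug allocator: the size field holds the pad pattern and the pad bytes hold zeros. Together with the _PyObject_Free -> PyMem_RawFree step in the backtrace, this might mean the pointer reached an allocator that did not hand it out, but the report alone does not establish that.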
The server code (file sourced into the new interpreter) is quite simple.
import sys
from pymqmsgque import *

# package-item
class MyServer(MqContextC):

    # factory startup
    def __init__(self, tmpl=None):
        self.ConfigSetServerSetup(self.ServerSetup)
        super().__init__(tmpl)

    # service to serve all incoming requests for token "HLWO"
    def MyFirstService(self):
        self.SendSTART()
        self.SendSTR(self.ReadSTR() + " World")
        self.SendRETURN()

    # define a service as link between the token "HLWO" and the callback "MyFirstService"
    def ServerSetup(self):
        self.ServiceCreate("HLWO",self.MyFirstService)

# package-main
if __name__ == "__main__":
    # create the "MyServer" factory… and the object
    srv = MqFactoryC.Add(MyServer).New()
    try:
        srv.LinkCreate(sys.argv)
        srv.ProcessEvent(MqWaitOnEventE.FOREVER)
    except Exception as ex:
        srv.ErrorCatch(ex)
    finally:
        srv.Exit()
The problem is that when I run this example with the server under valgrind, nothing happens (no crash).
The same example works fine when fork or spawn is used to start the server instances.
The problem occurs only with thread support.
Additional information: because my library supports only one server instance per thread, I chose PyInterpreterConfig_OWN_GIL
to make clear that I don't want any interaction between different Python interpreters.
All communication (between processes and threads) is done by my library. My library uses thread-local storage to isolate the data per thread.
CPython versions tested on:
3.12
Operating systems tested on:
Linux
Output from running 'python -VV' on the command line:
Python 3.12.4 (main, Aug 29 2024, 18:00:55) [GCC 13.3.0]