Py_NewInterpreterFromConfig end with hard crash #123488

Closed as not planned
@aotto1968

Description

Crash report

What happened?

I started using Py_NewInterpreterFromConfig to add a per-thread interpreter to my library.
The library is thread-safe and supports many languages:

  • C C++ JAVA C# VB.Net Perl Python GO TCL Ruby Php

Of these, the languages with thread support

  • C C++ JAVA C# TCL GO

are working fine. Python also works fine without thread support.

Task: I have now started to use thread support in Python.
I had to change my code to multi-phase initialization and isolate the Python data on a per-thread level, and finally got a simple thread example working -> ok

I am testing a client-server application. The server creates a thread in my library, and this thread is initialized with:

  if (create == MQ_FACTORY_NEW_THREAD) {
//mk_debug_color(MK_YELLOW,"%s","MQ_FACTORY_NEW_THREAD");
    static PyInterpreterConfig config = {
        .use_main_obmalloc = 0,
        .allow_fork = 0,
        .allow_exec = 0,
        .allow_threads = 1,
        .allow_daemon_threads = 0,
        .check_multi_interp_extensions = 1,
        .gil = PyInterpreterConfig_OWN_GIL,
    };

    PyThreadState *tstate = NULL;
    PyStatus status = Py_NewInterpreterFromConfig(&tstate, &config);
    MK_RT_REF.threadData = tstate;

//printV("#2 PyThreadState_Get<%p>", PyThreadState_Get())

    if (PyStatus_Exception(status)) {
      return MkErrorSetC_3_NULL(status.err_msg, status.func, 300+status.exitcode);
    }

    // mark new interpreter as "MQ_STARTUP_IS_THREAD"
    PyObject *mainM = PyImport_AddModule("__main__");
    PyObject *strO = PyUnicode_FromString("MQ_STARTUP_IS_THREAD");
    PyObject_SetAttrString(mainM,"__name__", strO);
    Py_CLEAR(strO);
    // source initial script
    FILE *FH = fopen(MqInitGetArg0()->data[1]->storage.first.C, "r");
    check_LNG(PyRun_SimpleFileEx(FH,MqInitGetArg0()->data[1]->storage.first.C,true)) {
      return MkErrorSetV_1E("script '%s' failed",MqInitGetArg0()->data[1]->storage.first.C);
    }
  }

I found that the server always crashes after 4 successful simple round trips (client → server → client), with the message below.

  • The stack-trace comes from my MqDisasterSignal handler.
exec[#6] -> 'NHI1_EXT/debug/bin/python3' 'NHI1_HOME/example/py/MyServer.py' '--thread' '--uds' '--file' '/tmp/test.uds'
Debug memory block at address p=0x7f0012d1a730: API '^@'
    18302063728033398269 bytes originally requested
    The 7 pad bytes at p-7 are not all FORBIDDENBYTE (0xfd):
        at p-7: 0x00 *** OUCH
        at p-6: 0x00 *** OUCH
        at p-5: 0x00 *** OUCH
        at p-4: 0x00 *** OUCH
        at p-3: 0x00 *** OUCH
        at p-2: 0x00 *** OUCH
        at p-1: 0x00 *** OUCH
    Because memory is corrupted at the start, the count of bytes requested
       may be bogus, and checking the trailing pad bytes may segfault.
    The 8 pad bytes at tail=0xfdfe7cfe10cfa52d are X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }: BackTrace {
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ library              : filename                       : lineno ] function
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ -------              : --------                       : ------ ] --------
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ theLink              : c/sys_mq.c                     : 705    ] MqDisasterSignal
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ unknown              : unknown                        : 0      ] unknown
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/obmalloc.c             : 2408   ] _PyObject_DebugDumpAddress
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/obmalloc.c             : 2326   ] _PyMem_DebugCheckAddress
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/obmalloc.c             : 2159   ] _PyMem_DebugRawFree
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/obmalloc.c             : 685    ] PyMem_RawFree
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/obmalloc.c             : 1853   ] _PyObject_Free
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/obmalloc.c             : 2163   ] _PyMem_DebugRawFree
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/obmalloc.c             : 2296   ] _PyMem_DebugFree
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/obmalloc.c             : 830    ] PyObject_Free
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/dictobject.c           : 1569   ] dictresize
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/dictobject.c           : 1194   ] insertion_resize
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/dictobject.c           : 1261   ] insertdict
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/dictobject.c           : 1865   ] _PyDict_SetItem_Take2
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/dictobject.c           : 1883   ] PyDict_SetItem
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/typeobject.c           : 7618   ] add_subclass
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/typeobject.c           : 7352   ] type_ready_add_subclasses
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/typeobject.c           : 7515   ] type_ready
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/typeobject.c           : 7553   ] PyType_Ready
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/typeobject.c           : 3795   ] type_new_impl
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/typeobject.c           : 3929   ] type_new
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/typeobject.c           : 1664   ] type_call
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/call.c                 : 240    ] _PyObject_MakeTpCall
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/typeobject.c           : 3949   ] type_vectorcall
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/call.c                 : 133    ] _PyObject_FastCallDictTstate
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/call.c                 : 157    ] PyObject_VectorcallDict
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Python/bltinmodule.c           : 208    ] builtin___build_class__
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/methodobject.c         : 438    ] cfunction_vectorcall_FASTCALL_KEYWORDS
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Include/internal/pycore_call.h : 92     ] _PyObject_VectorcallTstate
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Objects/call.c                 : 325    ] PyObject_Vectorcall
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Python/bytecodes.c             : 2714   ] _PyEval_EvalFrameDefault
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : nclude/internal/pycore_ceval.h : 89     ] _PyEval_EvalFrame
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Python/ceval.c                 : 1683   ] _PyEval_Vector
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Python/ceval.c                 : 578    ] PyEval_EvalCode
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Python/pythonrun.c             : 1722   ] run_eval_code_obj
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Python/pythonrun.c             : 1743   ] run_mod
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Python/pythonrun.c             : 1643   ] pyrun_file
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Python/pythonrun.c             : 433    ] _PyRun_SimpleFileObject
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ system               : Python/pythonrun.c             : 466    ] PyRun_SimpleFileExFlags
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ theLink              : py/MqFactoryC_py.c             : 195    ] py_mqmsgque_sFactoryCTor
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ theLink              : c/MqFactoryS_mq.c              : 289    ] MqFactoryInvoke_RT
X> {PRINT               :pid(66272):tid(0x7f000f479700):X:dlv(0):ctxId( 0):rc(1):ctx(0x1b87fb8     ):MqDisasterSignal              }:   [ theLink              : c/sys_mq.c                     : 269    ] MqSysServerThreadMain
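
The size and tail values in the dump above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original report. It assumes a 64-bit build and only uses the numbers printed in the dump; the variable names are illustrative. It shows that the "bytes originally requested" value is exactly eight copies of the 0xfd pad byte, and that the reported tail is the address plus that value, wrapped to 64 bits. One possible reading, which this sketch cannot confirm, is that the pointer handed to the free routine was not the start of a block created by this debug allocator.

```python
# Values copied verbatim from the debug dump above.
p = 0x7F0012D1A730                  # "Debug memory block at address p="
requested = 18302063728033398269    # "bytes originally requested"
tail = 0xFDFE7CFE10CFA52D           # "The 8 pad bytes at tail="

FORBIDDENBYTE = 0xFD                # pad byte value named in the dump

# Split the reported size into its 8 bytes and compare each to the pad byte.
size_bytes = requested.to_bytes(8, "big")
size_is_all_pad = all(b == FORBIDDENBYTE for b in size_bytes)

# The reported tail should equal p + requested, wrapped to 64 bits.
computed_tail = (p + requested) % 2**64
tail_matches = computed_tail == tail

print(hex(requested))
print("size field is all pad bytes:", size_is_all_pad)
print(hex(computed_tail), "matches reported tail:", tail_matches)
```

Both checks hold for the values in this report, so the huge size is not a real allocation request but padding read as a size.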

The server code (file sourced into the new interpreter) is quite simple.

import sys
from pymqmsgque import *

# package-item
class MyServer(MqContextC):
  
  # factory startup
  def __init__(self, tmpl=None):
    self.ConfigSetServerSetup(self.ServerSetup)
    super().__init__(tmpl)
    
  # service to serve all incoming requests for token "HLWO"
  def MyFirstService(self):
    self.SendSTART()
    self.SendSTR(self.ReadSTR() + " World")
    self.SendRETURN()
      
  # define a service as link between the token "HLWO" and the callback "MyFirstService"
  def ServerSetup(self):
    self.ServiceCreate("HLWO",self.MyFirstService)
  
# package-main
if __name__ == "__main__":

  # create the "MyServer" factory… and the object
  srv = MqFactoryC.Add(MyServer).New()

  try:
    srv.LinkCreate(sys.argv)
    srv.ProcessEvent(MqWaitOnEventE.FOREVER)
  except Exception as ex:
    srv.ErrorCatch(ex)
  finally:
    srv.Exit()

The problem is that when the server of this example runs under valgrind, nothing happens (no crash).
The same example works fine when fork or spawn is used to start the server instances.
The problem occurs only with thread support.

Additional information: because my library only supports one server instance per thread, I chose PyInterpreterConfig_OWN_GIL to make clear that I don't want any interaction between different Python interpreters.
All communication (between processes and threads) is done by my library. My library uses thread-local storage to isolate the data per thread.
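
To illustrate what per-thread isolation through thread-local storage means, here is a small Python analogy. It is only a sketch under stated assumptions: the library itself is written in C and its actual mechanism is not shown in this report, and the names `_state`, `worker`, and `results` are hypothetical.

```python
import threading

# Each thread sees its own independent set of attributes on this object.
_state = threading.local()

# Shared dict used only to collect what each thread observed.
results = {}

def worker(name):
    # Store a value that is visible only to the current thread.
    _state.context = f"context-for-{name}"
    results[name] = _state.context

threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never set `context`, so it does not see the workers' values.
main_has_context = hasattr(_state, "context")

print(results)
print("main thread sees a context:", main_has_context)
```

Each worker reads back only the value it stored, and the main thread sees none of them, which is the isolation property described above.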

CPython versions tested on:

3.12

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.12.4 (main, Aug 29 2024, 18:00:55) [GCC 13.3.0]

Metadata

Labels: type-crash (A hard crash of the interpreter, possibly with a core dump)