[RFC] Threading #1831

Closed
@palaviv

Description

Summary

I think we have reached the stage where we should support threading in RustPython.

Detailed Explanation

There has already been some discussion about threading, but I wanted to start a dedicated issue because it is a big decision. I will try to break the issue down into parts to give it some structure:

Definition of done

A user needs to be able to create a new thread in Python and execute code on the same Python object from multiple threads:

import threading

def worker(num):
    """thread worker function"""
    print('Worker: %s' % num)
    return

threads = []
for i in range(5):
    t = threading.Thread(target=worker, args=(i,))
    threads.append(t)
    t.start()

GIL

I think that one of the biggest discussions is about the GIL. Should we add a GIL to RustPython? Should we allow only one thread to use each object at a time?
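To make the trade-off concrete, a GIL can be modelled as one process-wide mutex that a thread must hold while it executes Python code. The sketch below is only an illustration: `GIL`, `run_threads` and the sleep that stands in for bytecode execution are hypothetical names, not RustPython code. It counts how many "interpreter" threads are ever active at the same time, with and without the global lock.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

// Hypothetical global interpreter lock: a single mutex for the whole process.
static GIL: Mutex<()> = Mutex::new(());

/// Spawns `threads` workers and returns the highest number of workers that
/// were inside the "interpreter" section at the same moment.
fn run_threads(threads: usize, use_gil: bool) -> usize {
    let active = Arc::new(AtomicUsize::new(0));
    let max_seen = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let active = Arc::clone(&active);
            let max_seen = Arc::clone(&max_seen);
            thread::spawn(move || {
                // With a GIL, the guard is held for the whole section below.
                let _guard = if use_gil {
                    Some(GIL.lock().unwrap())
                } else {
                    None
                };
                let now = active.fetch_add(1, Ordering::SeqCst) + 1;
                max_seen.fetch_max(now, Ordering::SeqCst);
                // Stand-in for executing Python bytecode.
                thread::sleep(Duration::from_millis(10));
                active.fetch_sub(1, Ordering::SeqCst);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    max_seen.load(Ordering::SeqCst)
}
```

With `use_gil` set to `true` the function always returns 1, because the sections are serialized. Without the lock the result depends on scheduling and can be anything up to the number of threads.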

Suggested approach

I suggest the following changes in order to reach the ability to spawn a new thread:

  • Create a new VirtualMachine for each thread. This lets VirtualMachine be !Sync and !Send.
  • Use Arc instead of Rc in PyObjectRef: pub type PyObjectRef = Arc<PyObject<dyn PyObjectPayload>>
  • The PyValue and PyObjectPayload traits will require Sync and Send, which forces us to make all Py* structs Sync and Send as well.
  • We will need to convert most uses of Rc, Cell and RefCell in the code to thread-safe alternatives such as Arc, Mutex and Atomic*.
  • When accessing the data of an object wrapped in a Mutex, we will lock the mutex for the duration of the access. This requires careful handling of internal objects that are shared between threads, to avoid deadlocks.
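The conversion described in the list above can be sketched on a toy payload. Everything here is a stand-in chosen for illustration (`ListPayload`, `ObjRef`, `append`, `append_from_threads`); none of it is RustPython's actual object layout. The comments note which single-threaded type each field would replace.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for a Python list payload.
struct ListPayload {
    // Would have been `RefCell<Vec<i64>>` in the single-threaded design.
    elements: Mutex<Vec<i64>>,
    // Would have been `Cell<usize>`; a plain counter only needs an atomic.
    append_count: AtomicUsize,
}

// Would have been `Rc<ListPayload>`.
type ObjRef = Arc<ListPayload>;

fn append(obj: &ObjRef, value: i64) {
    // The lock is held only for this statement; the guard is dropped at its end.
    obj.elements.lock().unwrap().push(value);
    obj.append_count.fetch_add(1, Ordering::Relaxed);
}

/// Appends to one shared object from several threads and returns the object.
fn append_from_threads(threads: usize, per_thread: usize) -> ObjRef {
    let obj: ObjRef = Arc::new(ListPayload {
        elements: Mutex::new(Vec::new()),
        append_count: AtomicUsize::new(0),
    });

    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let obj = Arc::clone(&obj);
            thread::spawn(move || {
                for i in 0..per_thread {
                    append(&obj, (t * per_thread + i) as i64);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    obj
}
```

Because `ListPayload` contains only `Mutex` and `AtomicUsize` fields, the compiler derives `Send` and `Sync` for it automatically, which is what allows the `Arc` to be moved into the spawned threads.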

A simple example of the start_new_thread method in _thread will look something like:

fn start_new_thread(func: PyFunctionRef, args: PyFuncArgs, vm: &VirtualMachine) -> u64 {
    let handle = thread::spawn(move || {
        let thread_vm = VirtualMachine::new(PySettings::default()); // Should get some params from original VM
        thread_vm.invoke(func.as_object(), args).unwrap();
    });
    get_id(handle.thread())
}
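A self-contained version of the same idea, using only the standard library, might look like the sketch below. `Vm`, `FuncRef` and this `start_new_thread` signature are hypothetical stand-ins for the RustPython types above. The `Rc` field makes `Vm` neither `Send` nor `Sync`, so the only way to use one on another thread is to construct it there, which is the per-thread VM approach.

```rust
use std::cell::RefCell;
use std::rc::Rc;
use std::sync::Arc;
use std::thread::{self, JoinHandle, ThreadId};

// Hypothetical stand-in for VirtualMachine. The Rc field makes it !Send and
// !Sync, so the compiler rejects moving or sharing it across threads.
struct Vm {
    frames: Rc<RefCell<Vec<String>>>,
}

impl Vm {
    fn new() -> Self {
        Vm {
            frames: Rc::new(RefCell::new(Vec::new())),
        }
    }

    // Pushes a pretend frame, calls the function, then pops the frame.
    fn invoke(&self, func: &(dyn Fn(i64) -> i64 + Send + Sync), arg: i64) -> i64 {
        self.frames.borrow_mut().push("call".to_string());
        let result = func(arg);
        self.frames.borrow_mut().pop();
        result
    }
}

// Stand-in for a callable Python object that is safe to share across threads.
type FuncRef = Arc<dyn Fn(i64) -> i64 + Send + Sync>;

fn start_new_thread(func: FuncRef, arg: i64) -> (ThreadId, JoinHandle<i64>) {
    let handle = thread::spawn(move || {
        // The VM is created inside the new thread and never leaves it.
        let thread_vm = Vm::new();
        thread_vm.invoke(&*func, arg)
    });
    (handle.thread().id(), handle)
}
```

Returning the `JoinHandle` alongside the id is a choice made for this sketch so that the caller can wait for the result; the real `_thread.start_new_thread` only returns an identifier.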

Drawbacks, Rationale, and Alternatives

  • I suggest that we ignore the GIL and allow multiple threads to execute Python code simultaneously. This could be a big advantage of RustPython over CPython. Users will be expected to make their own code thread safe. This might prove problematic for code that relies on the GIL.

  • What about third-party libraries? We do not have an API yet, but would we force them to implement Sync and Send as well?

  • There was a lot of talk about alternatives and I cannot find all of it, so please feel free to add to it in the comments. One alternative that was suggested is Crossbeam.

Unresolved Questions

  • How can we do this change in small steps?
  • How to test for deadlocks?
  • Is this the right time to do this?
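One possible starting point for the deadlock question is a watchdog-style test: run a scenario on a helper thread and fail if it does not finish within a time limit. The sketch below is an assumption about how such a test could look, not an existing RustPython utility. `completes_within`, `lock_both_ordered` and `swap_stress` are hypothetical names, and ordering locks by address is one common way to avoid lock-order inversions when two objects must be locked together.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Runs `scenario` on a helper thread and reports whether it finished in time.
/// A hang (or a panic) in the scenario makes this return false.
fn completes_within<F>(timeout: Duration, scenario: F) -> bool
where
    F: FnOnce() + Send + 'static,
{
    let (done_tx, done_rx) = mpsc::channel();
    thread::spawn(move || {
        scenario();
        let _ = done_tx.send(());
    });
    done_rx.recv_timeout(timeout).is_ok()
}

/// Locks two distinct objects in a fixed global order (by address), so two
/// threads passing them in opposite order cannot deadlock each other.
/// Assumes `a` and `b` are different objects.
fn lock_both_ordered(a: &Arc<Mutex<i64>>, b: &Arc<Mutex<i64>>) -> i64 {
    let (first, second) = if Arc::as_ptr(a) <= Arc::as_ptr(b) {
        (a, b)
    } else {
        (b, a)
    };
    let g1 = first.lock().unwrap();
    let g2 = second.lock().unwrap();
    *g1 + *g2
}

/// Two threads repeatedly lock the same pair of objects in opposite argument order.
fn swap_stress(rounds: usize) {
    let a = Arc::new(Mutex::new(1));
    let b = Arc::new(Mutex::new(2));
    let (a2, b2) = (Arc::clone(&a), Arc::clone(&b));
    let other = thread::spawn(move || {
        for _ in 0..rounds {
            lock_both_ordered(&b2, &a2);
        }
    });
    for _ in 0..rounds {
        lock_both_ordered(&a, &b);
    }
    other.join().unwrap();
}
```

A timeout can only show that a particular run hung; it cannot prove the absence of deadlocks, so a test like this would complement a documented lock ordering rather than replace it.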

Labels: E-help-wanted, RFC
