Description
Problem description
Running the SCUDA server on a GPU server and then executing client commands raises a RuntimeError.
Environmental information
CUDA_VERSION=12.6.2
DISTRO_VERSION=24.04
OS_DISTRO=ubuntu
CUDNN_TAG=cudnn
Reproduce steps
- Build an image using the example Dockerfile and start the container
- Execute the command: pip install numpy pandas torch
- Start the server with: ./local.sh server
- Set the environment variable: export SCUDA_SERVER=127.0.0.1
- Start the client with: LD_PRELOAD=./libscuda_12.6.so python3 test.py
My test.py file looks like this:
import torch
print(torch.cuda.is_available())
print(torch.cuda.get_device_name())
Current behavior
The output looks like this:
......
dlsym: cuModuleGetGlobal_v2
dlsym: PyInit__C
dlsym: PyInit__multiarray_umath
dlsym: PyInit__contextvars
dlsym: PyInit__umath_linalg
dlsym: PyInit_mmap
dlsym: PyInit__ssl
dlsym: PyInit__asyncio
dlsym: PyInit__queue
dlsym: PyInit__hashlib
dlsym: PyInit__multiprocessing
dlsym: cuDevicePrimaryCtxGetState
dlsym: cuGetErrorString
True
Traceback (most recent call last):
File "<string>", line 1, in <module>
...
File ".../site-packages/torch/cuda/__init__.py", line 372, in _lazy_init
torch._C._cuda_init()
RuntimeError: CUDA driver error: initialization error
It seems that torch.cuda.is_available() works normally, but CUDA cannot actually be initialized. I ran the test script without SCUDA and it gave the correct result; the RuntimeError only appears when the script is run with LD_PRELOAD=./libscuda_12.6.so. I suspect there is a problem in how the RPC layer implements CUDA's C driver interface.
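To help narrow this down, here is a minimal ctypes sketch (an assumption on my part: that libcuda.so.1 is the library name the LD_PRELOAD shim intercepts) that calls cuInit(0) directly, bypassing torch entirely. Comparing its result with and without LD_PRELOAD should show whether the failure is in SCUDA's forwarding of cuInit itself:

```python
import ctypes

# Diagnostic sketch: call cuInit(0) directly through the driver library,
# which LD_PRELOAD=./libscuda_12.6.so would replace with the RPC shim.
# cuInit is a standard CUDA driver API entry point; a return value of
# 0 means CUDA_SUCCESS, anything else is a CUresult error code.
def try_cuinit():
    try:
        cuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None  # no driver library available on this machine
    cuda.cuInit.restype = ctypes.c_int
    return cuda.cuInit(0)

result = try_cuinit()
print("cuInit result:", result)
```

If this returns a non-zero CUresult only when run under LD_PRELOAD, the problem is in SCUDA's handling of cuInit rather than anything torch-specific.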