RuntimeError was encountered while calling torch._C._cuda_init() #111

@James-Leong

Problem description
I started the SCUDA server on a GPU machine and then ran a client command against it. The client failed with a RuntimeError.

Environmental information
CUDA_VERSION=12.6.2
DISTRO_VERSION=24.04
OS_DISTRO=ubuntu
CUDNN_TAG=cudnn

Reproduce steps

  1. Build an image from the example Dockerfile and start a container.
  2. Install the Python packages: pip install numpy pandas torch
  3. Start the server: ./local.sh server
  4. Set the server address: export SCUDA_SERVER=127.0.0.1
  5. Start the client: LD_PRELOAD=./libscuda_12.6.so python3 test.py
    My test.py file looks like this:
import torch
print(torch.cuda.is_available())
print(torch.cuda.get_device_name())

Current behavior
The output looks like this:

......
dlsym: cuModuleGetGlobal_v2
dlsym: PyInit__C
dlsym: PyInit__multiarray_umath
dlsym: PyInit__contextvars
dlsym: PyInit__umath_linalg
dlsym: PyInit_mmap
dlsym: PyInit__ssl
dlsym: PyInit__asyncio
dlsym: PyInit__queue
dlsym: PyInit__hashlib
dlsym: PyInit__multiprocessing
dlsym: cuDevicePrimaryCtxGetState
dlsym: cuGetErrorString
True
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  ...
  File ".../site-packages/torch/cuda/__init__.py", line 372, in _lazy_init
    torch._C._cuda_init()
RuntimeError: CUDA driver error: initialization error

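It seems that torch.cuda.is_available() works normally (it prints True), but CUDA cannot actually be initialized: the failure comes from torch._C._cuda_init(), which get_device_name() triggers through _lazy_init.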
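To narrow down which call fails first, the steps can be run one at a time instead of relying on the traceback alone. The following is only a rough sketch: run_stages and torch_stages are hypothetical helper names, and the choice of stages is an assumption about which public torch.cuda calls are worth probing, not something taken from SCUDA.

```python
def run_stages(stages):
    """Run (name, callable) pairs in order and stop at the first exception.

    Returns a list of (name, ok, detail) tuples, where detail is the repr of
    the return value on success or "ExceptionType: message" on failure.
    """
    results = []
    for name, fn in stages:
        try:
            results.append((name, True, repr(fn())))
        except Exception as exc:
            results.append((name, False, f"{type(exc).__name__}: {exc}"))
            break  # later stages depend on the earlier ones
    return results


def torch_stages():
    """Build the stage list lazily so importing torch is optional."""
    import torch

    return [
        ("is_available", torch.cuda.is_available),
        ("init", torch.cuda.init),  # explicit version of the lazy init
        ("device_count", torch.cuda.device_count),
        ("get_device_name", torch.cuda.get_device_name),
    ]


if __name__ == "__main__":
    try:
        stages = torch_stages()
    except ImportError:
        print("torch is not installed; nothing to probe")
    else:
        for name, ok, detail in run_stages(stages):
            print(f"{'ok  ' if ok else 'FAIL'} {name}: {detail}")
```

Running this under LD_PRELOAD should show whether the failure already appears at the explicit init stage, which would match the traceback above.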

I ran the same script without SCUDA and it gave the correct result. However, the RuntimeError occurs as soon as the script is run with LD_PRELOAD=./libscuda_12.6.so. My guess is that the problem lies in how the CUDA C interface is implemented over RPC.
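The comparison with and without the preload can be made repeatable with a small wrapper. This is a minimal sketch, assuming the library path and SCUDA_SERVER value from the reproduce steps; run_python is a hypothetical helper, not part of SCUDA.

```python
import os
import subprocess
import sys


def run_python(args, preload=None, extra_env=None):
    """Run the current Python interpreter with args in a child process.

    LD_PRELOAD is removed from the inherited environment and set only when
    preload is given, so the two runs differ in exactly that one variable.
    Returns (returncode, stdout, stderr).
    """
    env = dict(os.environ)
    env.pop("LD_PRELOAD", None)
    if preload is not None:
        env["LD_PRELOAD"] = preload
    if extra_env:
        env.update(extra_env)
    proc = subprocess.run(
        [sys.executable, *args], env=env, capture_output=True, text=True
    )
    return proc.returncode, proc.stdout, proc.stderr


# Example usage (paths and values taken from the reproduce steps):
#   baseline = run_python(["test.py"])
#   scuda = run_python(
#       ["test.py"],
#       preload="./libscuda_12.6.so",
#       extra_env={"SCUDA_SERVER": "127.0.0.1"},
#   )
#   print("baseline exit code:", baseline[0])
#   print("scuda exit code:", scuda[0])
```

Comparing the two return codes and stderr outputs side by side makes it easier to attach both results to the report.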
