Closed
jemiryguo/cupy
#1
Description
In class NCCLBackend(_Backend) there is a function _get_op:
def _get_op(self, op, dtype):
    if op not in _nccl_ops:
        raise RuntimeError(f'Unknown op {op} for NCCL')
    if dtype in 'FD' and op != nccl.NCCL_SUM:
        raise ValueError(
            'Only nccl.SUM is supported for complex arrays')
    return _nccl_ops[op]
op is expected to be one of 'sum', 'prod', 'max', or 'min', according to the definition of _nccl_ops:
_nccl_ops = {'sum': nccl.NCCL_SUM,
             'prod': nccl.NCCL_PROD,
             'max': nccl.NCCL_MAX,
             'min': nccl.NCCL_MIN}
However, because op is a string key, the comparison op != nccl.NCCL_SUM is always true, so the ValueError is raised for complex dtypes even when op is 'sum'. The check should be corrected to op != 'sum'.
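The mismatch can be illustrated without a GPU or CuPy installed. The following is only a minimal sketch: the integer constants are stand-ins for the values exposed by cupy.cuda.nccl, and get_op_buggy and get_op_fixed are hypothetical free functions that mirror the method body quoted in this report. They are not the library's actual code or the eventual patch.

```python
# Stand-in integer constants (assumption); in CuPy these come from cupy.cuda.nccl.
NCCL_SUM, NCCL_PROD, NCCL_MAX, NCCL_MIN = 0, 1, 2, 3

_nccl_ops = {'sum': NCCL_SUM,
             'prod': NCCL_PROD,
             'max': NCCL_MAX,
             'min': NCCL_MIN}


def get_op_buggy(op, dtype):
    """Mirror of the reported check: compares a string key to an integer."""
    if op not in _nccl_ops:
        raise RuntimeError(f'Unknown op {op} for NCCL')
    # op is a str such as 'sum', so it never equals the integer NCCL_SUM
    # and this branch fires for every complex dtype.
    if dtype in 'FD' and op != NCCL_SUM:
        raise ValueError('Only nccl.SUM is supported for complex arrays')
    return _nccl_ops[op]


def get_op_fixed(op, dtype):
    """Proposed check: compare the string key against 'sum'."""
    if op not in _nccl_ops:
        raise RuntimeError(f'Unknown op {op} for NCCL')
    if dtype in 'FD' and op != 'sum':
        raise ValueError('Only nccl.SUM is supported for complex arrays')
    return _nccl_ops[op]


if __name__ == '__main__':
    try:
        get_op_buggy('sum', 'D')
    except ValueError as exc:
        print('buggy check rejects sum on complex input:', exc)
    print('fixed check returns:', get_op_fixed('sum', 'D'))
```

With the string comparison, 'sum' on a complex dtype is accepted, while 'prod', 'max', and 'min' on complex dtypes are still rejected as intended.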
To Reproduce
import cupy, os
from cupyx.distributed import NCCLBackend
os.unsetenv("NCCL_DEBUG")
NCCLBackend._get_op(None, 'sum', 'D')
Installation
Conda-Forge (conda install ...)
Environment
OS : Linux-5.4.143.bsk.7-amd64-x86_64-with-glibc2.31
Python Version : 3.10.9
CuPy Version : 11.5.0
CuPy Platform : NVIDIA CUDA
NumPy Version : 1.24.2
SciPy Version : 1.10.1
Cython Build Version : 0.29.33
Cython Runtime Version : None
CUDA Root : /usr/local/cuda
nvcc PATH : /usr/local/cuda/bin/nvcc
CUDA Build Version : 11020
CUDA Driver Version : 11080
CUDA Runtime Version : 11080
cuBLAS Version : (available)
cuFFT Version : 10900
cuRAND Version : 10300
cuSOLVER Version : (11, 4, 1)
cuSPARSE Version : (available)
NVRTC Version : (11, 8)
Thrust Version : 101000
CUB Build Version : 101000
Jitify Build Version : b8d229d
cuDNN Build Version : 8401
cuDNN Version : 8600
NCCL Build Version : 21403
NCCL Runtime Version : 21501
cuTENSOR Version : 10602
cuSPARSELt Build Version : None
Device 0 Name : NVIDIA A100-SXM4-80GB
Device 0 Compute Capability : 80
Device 0 PCI Bus ID : 0000:16:00.0
Additional Information
I can push a PR to fix this.