Closed
Description
Some of our ML pipelines still depend on CUDA 10, while others use CUDA 11. The images are built by an automated pipeline, and as of a couple of days ago the CUDA 10 Docker image fails while building nccl-tests: common.cu no longer compiles to common.o, and the errors are reported for the sm_70 gencode (compute_70). The failing build step and its output:
Step 15/16 : RUN git clone https://github.com/NVIDIA/nccl-tests.git $HOME/nccl-tests && cd $HOME/nccl-tests && git checkout ${NCCL_TESTS_VERSION} && make MPI=1 MPI_HOME=/opt/amazon/openmpi/ CUDA_HOME=/usr/local/cuda NCCL_HOME=/opt/nccl/build NVCC_GENCODE="-gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_70,code=sm_70"
---> Running in 01fdf75d8d36
Cloning into '/tmp/nccl-tests'...
Already on 'master'
Your branch is up to date with 'origin/master'.
make -C src build
make[1]: Entering directory '/tmp/nccl-tests/src'
Compiling all_reduce.cu > ../build/all_reduce.o
Compiling common.cu > ../build/common.o
common.cu(542): error: identifier "cudaStreamCaptureModeThreadLocal" is undefined
common.cu(542): error: identifier "cudaStreamCaptureModeGlobal" is undefined
common.cu(542): error: too many arguments in function call
common.cu(606): error: identifier "cudaStreamCaptureModeThreadLocal" is undefined
common.cu(606): error: too many arguments in function call
5 errors detected in the compilation of "/tmp/tmpxft_0000004b_00000000-7_common.compute_70.cpp1.ii".
make[1]: *** [../build/common.o] Error 1
Makefile:82: recipe for target '../build/common.o' failed
make[1]: Leaving directory '/tmp/nccl-tests/src'
make: *** [src.build] Error 2
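The log shows `Already on 'master'`, so `${NCCL_TESTS_VERSION}` resolves to the moving `master` branch, and the undefined identifiers (`cudaStreamCaptureModeThreadLocal`, `cudaStreamCaptureModeGlobal`) together with "too many arguments in function call" look like a call to `cudaStreamBeginCapture` with a capture-mode argument. As far as I know that argument was only added in CUDA 10.1, so an older CUDA 10 toolkit would fail in exactly this way. A possible workaround on our side is to pick the nccl-tests ref per CUDA version instead of always building `master`. The sketch below is only an illustration: the helper names and the `PINNED_REF_PLACEHOLDER` value are hypothetical, the 10.1 cutoff is my assumption, and the pinned ref has to be replaced with a commit that has actually been verified to build against the CUDA 10 image.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the placeholder ref are hypothetical.

# Succeeds if version $1 (e.g. "10.0") is at least version $2 (e.g. "10.1").
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
    # If the required minimum sorts first (or both are equal), $1 >= $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Prints the nccl-tests ref to check out for a given CUDA version.
#   $1: CUDA toolkit version, e.g. "10.0" or "11.0"
#   $2: ref to use for older toolkits (placeholder by default; must be
#       replaced with a commit verified to build on that toolkit)
select_nccl_tests_ref() {
    local cuda_version="$1"
    local pinned_ref="${2:-PINNED_REF_PLACEHOLDER}"
    # Assumption: the capture-mode argument requires CUDA >= 10.1.
    if version_at_least "$cuda_version" "10.1"; then
        echo "master"
    else
        echo "$pinned_ref"
    fi
}
```

The result could then be passed to the image build as `NCCL_TESTS_VERSION`, so the CUDA 11 images keep tracking `master` while the CUDA 10 image stays on a fixed ref until it is upgraded.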