This project is neither sponsored nor supported by NVIDIA.
Use of NVIDIA NVSHMEM is governed by the terms at NVSHMEM Software License Agreement.
Hardware requirements:
- GPUs within a node need to be connected by NVLink
- GPUs across different nodes need to be connected by RDMA devices; see the GPUDirect RDMA Documentation
- InfiniBand GPUDirect Async (IBGDA) support, see IBGDA Overview
- For more detailed requirements, see NVSHMEM Hardware Specifications
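The requirements above can be spot-checked from a shell before building. This is a rough sketch, assuming `nvidia-smi` (shipped with the NVIDIA driver) and `ibv_devinfo` (from rdma-core) are the tools available on your nodes; `run_if_present` is a hypothetical helper introduced only for this example. The commands show the GPU topology and RDMA device presence; they do not by themselves confirm IBGDA support.

```shell
# Hypothetical helper: run a command only if it is installed,
# and keep going if it is missing or exits with an error.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@" || echo "warning: '$*' exited with status $?"
    else
        echo "skipped: $1 not found"
    fi
}

# GPU-to-GPU topology matrix; NVLink connections appear as NV<n> entries.
run_if_present nvidia-smi topo -m

# Per-link NVLink status for each GPU.
run_if_present nvidia-smi nvlink --status

# RDMA devices visible to the verbs stack (needed for inter-node traffic).
run_if_present ibv_devinfo
```

Missing tools are reported as skipped rather than treated as failures, so the script can be run on any node to see what is installed.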
Download the NVSHMEM source code from the NVIDIA NVSHMEM Open Source Packages page.
NOTE: After NVSHMEM v3.3.9, it is no longer necessary to apply our patch to achieve optimal performance.
If you are using an older version that still needs the patch, navigate to your NVSHMEM source directory and apply it:
git apply /path/to/deep_ep/dir/third-party/nvshmem.patch
Enable IBGDA by modifying /etc/modprobe.d/nvidia.conf:
options nvidia NVreg_EnableStreamMemOPs=1 NVreg_RegistryDwords="PeerMappingOverride=1;"
Update kernel configuration:
sudo update-initramfs -u
sudo reboot
For more detailed configurations, please refer to the NVSHMEM Installation Guide.
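After the reboot, one way to confirm that the driver options took effect is to read them back from the driver. The following is a minimal sketch, assuming the NVIDIA driver exposes its module parameters in /proc/driver/nvidia/params as `Name: value` lines; the function name `check_ibgda_params` and the exact match patterns are assumptions made for this example, not part of NVSHMEM or DeepEP.

```shell
# Hypothetical check: confirm both options from nvidia.conf are active.
# The params file path can be overridden by passing it as the first argument.
check_ibgda_params() {
    params_file="${1:-/proc/driver/nvidia/params}"
    if [ ! -r "$params_file" ]; then
        echo "cannot read $params_file (is the NVIDIA driver loaded?)"
        return 2
    fi
    # Assumed format: 'EnableStreamMemOPs: 1' and a RegistryDwords line
    # that contains 'PeerMappingOverride=1'.
    if grep -q '^EnableStreamMemOPs: 1' "$params_file" &&
       grep -q 'PeerMappingOverride=1' "$params_file"; then
        echo "IBGDA driver options are active"
    else
        echo "IBGDA driver options are missing"
        return 1
    fi
}

check_ibgda_params || echo "IBGDA options not confirmed; re-check /etc/modprobe.d/nvidia.conf"
```

If the check reports missing options, the usual causes are a typo in nvidia.conf or an initramfs that was not regenerated before rebooting.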
DeepEP uses NVLink for intra-node communication and IBGDA for inter-node communication. All other NVSHMEM features are disabled to reduce dependencies.
export CUDA_HOME=/path/to/cuda
# disable all features except IBGDA
export NVSHMEM_IBGDA_SUPPORT=1
export NVSHMEM_SHMEM_SUPPORT=0
export NVSHMEM_UCX_SUPPORT=0
export NVSHMEM_USE_NCCL=0
export NVSHMEM_PMIX_SUPPORT=0
export NVSHMEM_TIMEOUT_DEVICE_POLLING=0
export NVSHMEM_USE_GDRCOPY=0
export NVSHMEM_IBRC_SUPPORT=0
export NVSHMEM_BUILD_TESTS=0
export NVSHMEM_BUILD_EXAMPLES=0
export NVSHMEM_MPI_SUPPORT=0
export NVSHMEM_BUILD_HYDRA_LAUNCHER=0
export NVSHMEM_BUILD_TXZ_PACKAGE=0
cmake -G Ninja -S . -B build -DCMAKE_INSTALL_PREFIX=/path/to/your/dir/to/install
cmake --build build/ --target install
Set environment variables in your shell configuration:
export NVSHMEM_DIR=/path/to/your/dir/to/install # Use for DeepEP installation
export LD_LIBRARY_PATH="${NVSHMEM_DIR}/lib:$LD_LIBRARY_PATH"
export PATH="${NVSHMEM_DIR}/bin:$PATH"
Verify the installation:
nvshmem-info -a # Should display details of nvshmem
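If `nvshmem-info` is not found, it can help to check the install prefix before debugging the build. Below is a small sketch, assuming the install step placed libraries under `lib` and tools under `bin` as the environment variables above imply; `verify_nvshmem_install` is a hypothetical helper, not something NVSHMEM provides.

```shell
# Hypothetical helper: check that an NVSHMEM install prefix looks complete.
# Uses the first argument if given, otherwise $NVSHMEM_DIR.
verify_nvshmem_install() {
    dir="${1:-${NVSHMEM_DIR:-}}"
    if [ -z "$dir" ]; then
        echo "NVSHMEM_DIR is not set"
        return 1
    fi
    for sub in lib bin; do
        if [ ! -d "$dir/$sub" ]; then
            echo "missing directory: $dir/$sub"
            return 1
        fi
    done
    if [ ! -x "$dir/bin/nvshmem-info" ]; then
        echo "missing executable: $dir/bin/nvshmem-info"
        return 1
    fi
    echo "ok: $dir"
}

verify_nvshmem_install "${NVSHMEM_DIR:-}" || echo "NVSHMEM install check failed"
```

The check only looks at the directory layout; running `nvshmem-info -a` remains the actual functional test.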