Install pytorch from pypi using local CUDA build #150742
Comments
@ikrommyd thank you for your suggestion. Looks reasonable. By the way, have you tried using
@malfet
Huh, that's interesting: here I am in an environment without any nvidia packages installed and just torch from PyPI, with a system-wide CUDA 12.6 installation and my own build of NCCL under:
(ak-uproot) ➜ ~ echo $LD_LIBRARY_PATH
/usr/local/cuda-12.6/lib64:/home/iason/software/torch/nccl/build/lib
(ak-uproot) ➜ ~ ipython
Python 3.12.9 | packaged by conda-forge | (main, Feb 14 2025, 08:00:06) [GCC 13.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.32.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import torch
In [2]: torch.cuda.is_available()
Out[2]: True
In [3]: torch.cuda.is_bf16_supported()
Out[3]: True
In [4]: x = torch.rand(5, 3)
In [5]: print(x)
tensor([[0.3051, 0.5993, 0.2161],
[0.1847, 0.2640, 0.4210],
[0.0208, 0.4589, 0.3539],
[0.5538, 0.0826, 0.1458],
[0.3174, 0.1281, 0.6403]])
In [6]: import torch
...: import torch.nn as nn
...:
...: # Check if CUDA is available
...: device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
...: print(f"Using device: {device}")
...:
...: # Set parameters
...: batch_size = 8
...: seq_length = 10
...: embed_dim = 512
...: num_heads = 8
...:
...: # Create random input tensors
...: # Shape: [sequence length, batch size, embedding dimension]
...: query = torch.randn(seq_length, batch_size, embed_dim, device=device)
...: key = torch.randn(seq_length, batch_size, embed_dim, device=device)
...: value = torch.randn(seq_length, batch_size, embed_dim, device=device)
...:
...: # Create MultiheadAttention module
...: mha = nn.MultiheadAttention(embed_dim, num_heads).to(device)
...:
...: # Forward pass
...: output, attn_weights = mha(query, key, value)
...:
...: # Print shapes
...: print(f"Input shapes: query={query.shape}, key={key.shape}, value={value.shape}")
...: print(f"Output shape: {output.shape}")
...: print(f"Attention weights shape: {attn_weights.shape}")
Using device: cuda
Input shapes: query=torch.Size([10, 8, 512]), key=torch.Size([10, 8, 512]), value=torch.Size([10, 8, 512])
Output shape: torch.Size([10, 8, 512])
Attention weights shape: torch.Size([8, 10, 10])
In [7]: query.device
Out[7]: device(type='cuda', index=0)
In [8]: exit()
(ak-uproot) ➜ ~ pip list | grep torch
torch 2.6.0+cu126
torchaudio 2.6.0+cu126
torchvision 0.21.0+cu126
(ak-uproot) ➜ ~ pip list | grep nvidia
(ak-uproot) ➜ ~
I'd like to say, then, that this should probably be tested more officially (in CI, perhaps) and advertised more prominently in the installation instructions. Perhaps the existence of
I don't understand why
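On the CI point: if coverage were added for this setup, a minimal smoke test might look something like the sketch below. This is only an illustration, not an existing PyTorch test; it assumes a Linux runner with a system CUDA toolkit on `LD_LIBRARY_PATH` and a CUDA-enabled `torch` wheel installed without the `nvidia-*` dependency wheels.

```python
# Hypothetical smoke test: torch installed without any nvidia-* pip wheels
# should still find the system CUDA libraries and run a GPU op.
import importlib.metadata

import torch


def test_system_cuda_without_nvidia_wheels():
    # No nvidia-* runtime wheels should be present in this environment.
    nvidia_dists = [
        d.metadata["Name"]
        for d in importlib.metadata.distributions()
        if d.metadata["Name"] and d.metadata["Name"].startswith("nvidia-")
    ]
    assert not nvidia_dists, f"unexpected nvidia wheels: {nvidia_dists}"

    # CUDA must still be usable via the system toolkit.
    assert torch.cuda.is_available()
    x = torch.rand(64, 64, device="cuda")
    assert (x @ x).device.type == "cuda"
```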
To add one more comment: it would be really nice if a support matrix (or something similar) were provided listing what the user needs to install on their system (libraries from NVIDIA or others, and the supported versions) to install PyTorch without the nvidia dependencies.
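In the absence of an official support matrix, one rough way to see which system libraries a given torch build expects is to inspect the shared libraries it ships. A minimal sketch, assuming a Linux install where the CUDA pieces live in `libtorch_cuda*.so` under torch's `lib/` directory (library names vary between releases):

```python
# Rough sketch: list the CUDA-related shared libraries torch's native code
# links against, to see what must be provided by the system (or by wheels).
import pathlib
import subprocess

import torch

lib_dir = pathlib.Path(torch.__file__).parent / "lib"
for so in sorted(lib_dir.glob("libtorch_cuda*.so")):
    ldd = subprocess.run(["ldd", str(so)], capture_output=True, text=True)
    print(f"--- {so.name} ---")
    for line in ldd.stdout.splitlines():
        if any(k in line for k in ("cudart", "cudnn", "cublas", "nccl", "nvrtc", "cufft", "cusparse")):
            print(line.strip())
```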
🚀 The feature, motivation and pitch
It's great that NVIDIA provides wheels for the CUDA-related packages and we don't need `conda`/`mamba` to install PyTorch anymore, but those packages take up space if you install PyTorch in multiple environments. It would be nice if you could install a PyTorch version from PyPI that could grab and use your local CUDA build.
For example, `cupy` provides `pip install cupy-cuda12x`, `jax` provides `pip install "jax[cuda12_local]"`, and as far as I'm aware, `pip install tensorflow` also appears to use the GPU even if I don't specify `pip install "tensorflow[and-cuda]"`, which could install the nvidia/CUDA wheels as well. Please close this if it's just not possible in PyTorch's case, or if it's a duplicate (I didn't see one if it's there).
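For what it's worth, a quick way to check whether an installed torch is picking up a system CUDA runtime or the `nvidia-*` pip wheels is to look at which `libcudart` the process actually mapped. A minimal, Linux-only sketch (the paths in the comments are illustrative, and it assumes torch loads `libcudart` dynamically):

```python
# Sketch: after CUDA initializes, see where libcudart was loaded from.
# A system install typically resolves under /usr/local/cuda-*/lib64, while the
# pip wheels resolve under .../site-packages/nvidia/cuda_runtime/lib.
import torch

torch.cuda.init()  # force the CUDA runtime to be loaded

with open("/proc/self/maps") as maps:
    cudart = sorted({line.split()[-1] for line in maps if "libcudart" in line})

print(cudart)
```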
Alternatives
Just have the available space and install the nvidia wheels on every environment separately.
Additional context
No response
cc @seemethere @malfet @osalpekar @atalman @pytorch/pytorch-dev-infra