🐛 Describe the bug
When converting a CPU tensor to MPS, the resulting MPS tensor no longer matches the original once the total size reaches 4 GiB (2**30 float32 elements × 4 bytes); tensors just under 4 GiB copy correctly.
import torch

# 2**30 - 1 float32 elements: just under 4 GiB
t = torch.ones((2**30-1,), dtype=torch.float32)
t2 = t.to("mps")
print("CPU <4GiB:", (t == torch.tensor(1)).all())
print("MPS <4GiB:", (t2 == torch.tensor(1, device="mps")).all())
print()

# 2**30 float32 elements: exactly 4 GiB
t = torch.ones((2**30,), dtype=torch.float32)
t2 = t.to("mps")
print("CPU 4GiB:", (t == torch.tensor(1)).all())
print("MPS 4GiB:", (t2 == torch.tensor(1, device="mps")).all())
Output:
CPU <4GiB: tensor(True)
MPS <4GiB: tensor(True, device='mps:0')
CPU 4GiB: tensor(True)
MPS 4GiB: tensor(False, device='mps:0')
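
As a possible workaround while this is open, here is a minimal sketch that copies the data to MPS in sub-4 GiB chunks. The name to_mps_chunked and the 1 GiB chunk size are made up for illustration, and it assumes the corruption only affects single CPU-to-MPS transfers of 4 GiB or more:

import torch

def to_mps_chunked(t: torch.Tensor, chunk_bytes: int = 2**30) -> torch.Tensor:
    # Hypothetical workaround: move the tensor in pieces below the 4 GiB boundary.
    flat = t.reshape(-1)
    out = torch.empty_like(flat, device="mps")
    elems_per_chunk = max(1, chunk_bytes // flat.element_size())
    for start in range(0, flat.numel(), elems_per_chunk):
        end = min(start + elems_per_chunk, flat.numel())
        out[start:end] = flat[start:end].to("mps")
    return out.reshape(t.shape)

t = torch.ones((2**30,), dtype=torch.float32)
print("MPS 4GiB (chunked):", (to_mps_chunked(t) == torch.tensor(1, device="mps")).all())

This is not a fix; it only sketches a way to narrow the issue down to single copies at or above the 4 GiB boundary.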
Versions
PyTorch version: 2.2.2
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 14.4.1 (arm64)
GCC version: Could not collect
Clang version: 15.0.0 (clang-1500.3.9.4)
CMake version: version 3.29.0
Libc version: N/A
Python version: 3.11.8 (main, Apr 4 2024, 20:29:28) [Clang 15.0.0 (clang-1500.3.9.4)] (64-bit runtime)
Python platform: macOS-14.4.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Apple M3 Max
Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.2.2
[conda] Could not collect
cc @ezyang @gchanan @zou3519 @kadeng @mruberry @kulinseth @albanD @malfet @DenisVieriu97 @jhavukainen