MPS: Conv1d fails with NotImplementedError for output_channels > 65536 #152278

Open

ehartford opened this issue Apr 27, 2025 · 4 comments
Labels
module: convolution Problems related to convolutions (THNN, THCUNN, CuDNN) module: mps Related to Apple Metal Performance Shaders framework triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Milestone
2.8.0

Comments

ehartford commented Apr 27, 2025

πŸ› Describe the bug

Running torch.nn.functional.conv1d (or torch.nn.Conv1d) on the MPS backend results in the following error when the number of output channels exceeds 65536:

NotImplementedError: Output channels > 65536 not supported at the MPS device.

This limitation prevents certain common model architectures from running natively on the MPS device; standard Wav2Vec2 implementations, for example, use Conv1d layers with high channel counts in their feature extractors.

The current workarounds are to set the global PYTORCH_ENABLE_MPS_FALLBACK=1 environment variable or to make targeted code changes that move the specific conv1d operation and its inputs/outputs to the CPU (a sketch of the latter appears below); both approaches hurt performance compared to native MPS execution.
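
For reference, a minimal sketch of the targeted CPU-fallback workaround; the helper name conv1d_via_cpu is illustrative, not an existing PyTorch API:

import torch
import torch.nn.functional as F

def conv1d_via_cpu(x, weight, bias=None, **kwargs):
    # Run the unsupported op on the CPU, then move the result back to the
    # original (MPS) device. The host <-> device copies are what make this slow.
    out = F.conv1d(x.cpu(), weight.cpu(),
                   bias.cpu() if bias is not None else None, **kwargs)
    return out.to(x.device)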

Please consider adding support for conv1d operations with output channels > 65536 on the MPS backend to improve hardware acceleration coverage and performance for models relying on such layers.

To reproduce:

import torch
import torch.nn.functional as F

# Check for MPS availability
if not torch.backends.mps.is_available():
    print("MPS device not available. This snippet requires an Apple Silicon Mac with PyTorch built with MPS support.")
    exit()

if not torch.backends.mps.is_built():
    print("PyTorch was not built with MPS support. This snippet requires an Apple Silicon Mac with PyTorch built with MPS support.")
    exit()

device = torch.device("mps")
print(f"Using device: {device}")

# Define parameters
batch_size = 1
in_channels = 1
length = 1024
out_channels_problematic = 65537 # > 65536
kernel_size = 3

# Create input and weight tensors
try:
    input_tensor = torch.randn(batch_size, in_channels, length, device=device)
    # Weight shape: (out_channels, in_channels, kernel_size)
    weight_tensor = torch.randn(out_channels_problematic, in_channels, kernel_size, device=device)

    print(f"Input tensor shape: {input_tensor.shape}, device: {input_tensor.device}")
    print(f"Weight tensor shape: {weight_tensor.shape}, device: {weight_tensor.device}")

    # Attempt the problematic conv1d operation
    print(f"\nAttempting F.conv1d with out_channels={out_channels_problematic}...")
    output = F.conv1d(input_tensor, weight_tensor)
    print("Operation succeeded unexpectedly.") # Should not reach here

except NotImplementedError as e:
    print("\nSuccessfully reproduced the expected error:")
    print(f"  Type: {type(e)}")
    print(f"  Message: {e}")
except Exception as e:
    print("\nCaught an unexpected error:")
    print(f"  Type: {type(e)}")
    print(f"  Message: {e}")

Environment:

PyTorch Version: 2.5.1
macOS Version: Sequoia 15.4.1
Hardware: Apple Silicon (M-series chip)

Versions

Collecting environment information...
PyTorch version: 2.5.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 15.4.1 (arm64)
GCC version: Could not collect
Clang version: 16.0.0 (clang-1600.0.26.6)
CMake version: version 3.31.1
Libc version: N/A

Python version: 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:35:25) [Clang 16.0.6 ] (64-bit runtime)
Python platform: macOS-15.4.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M4 Max

Versions of relevant libraries:
[pip3] mypy_extensions==1.1.0
[pip3] numpy==1.26.4
[pip3] onnx==1.17.0
[pip3] onnx-weekly==1.19.0.dev20250425
[pip3] onnx2torch==1.5.15
[pip3] onnx2torch-py313==1.6.0
[pip3] onnxruntime==1.21.1
[pip3] pytorch-wpe==0.0.1
[pip3] rotary-embedding-torch==0.6.5
[pip3] torch==2.5.1
[pip3] torch-complex==0.4.4
[pip3] torchaudio==2.5.1
[pip3] torchvision==0.20.1
[conda] libopenvino-pytorch-frontend 2025.0.0 h286801f_3 conda-forge
[conda] numpy 1.26.4 pypi_0 pypi
[conda] onnx2torch 1.5.15 pypi_0 pypi
[conda] onnx2torch-py313 1.6.0 pypi_0 pypi
[conda] pytorch-wpe 0.0.1 pypi_0 pypi
[conda] rotary-embedding-torch 0.6.5 pypi_0 pypi
[conda] torch 2.5.1 pypi_0 pypi
[conda] torch-complex 0.4.4 pypi_0 pypi
[conda] torchaudio 2.5.1 pypi_0 pypi
[conda] torchvision 0.20.1 pypi_0 pypi

cc @kulinseth @albanD @malfet @DenisVieriu97 @jhavukainen

@malfet malfet added module: convolution Problems related to convolutions (THNN, THCUNN, CuDNN) triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module module: mps Related to Apple Metal Performance Shaders framework labels Apr 27, 2025
@malfet malfet added this to the 2.8.0 milestone Apr 27, 2025

malfet commented Apr 27, 2025

@skotapati I remember you've worked on a solution that just slices the tensor, didn't you?
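
(For context, a minimal sketch of such a channel-slicing workaround, relying only on conv1d being computed independently per output channel; the helper name conv1d_chunked and the 65536 chunk size are illustrative, not an existing PyTorch API:)

import torch
import torch.nn.functional as F

def conv1d_chunked(x, weight, bias=None, max_out=65536, **kwargs):
    # Split the weight (and bias) along the output-channel dimension into
    # chunks the MPS kernel accepts, run conv1d per chunk, and concatenate
    # the partial outputs along the channel dimension of the result.
    outs = []
    for i in range(0, weight.shape[0], max_out):
        b = bias[i:i + max_out] if bias is not None else None
        outs.append(F.conv1d(x, weight[i:i + max_out], b, **kwargs))
    return torch.cat(outs, dim=1)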

jhavukainen commented

From torch==2.6.0 onwards, this error should only be raised when the OS version is determined to be < 15.1, since support for larger output dimensions was extended in macOS 15.1. @ehartford, would you be able to upgrade your PyTorch version to 2.6.0 or newer to get the op support? Or, more accurately, to get rid of the assert that prevents your op from being supported.
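
For anyone verifying locally, a quick illustrative check of the two conditions mentioned above (not an official API contract):

import platform
import torch

print("torch:", torch.__version__)      # expect >= 2.6.0 for the relaxed check
print("macOS:", platform.mac_ver()[0])  # expect >= 15.1 for the extended op support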

ehartford commented

I will try, thank you

jhavukainen commented

Great, thanks! Please let us know if you still see the problem there, or if we can close the issue.
