
conv2d with int8 on CUDA: GET was unable to find an engine to execute this computation #152992


Open
c-f-h opened this issue May 6, 2025 · 2 comments
Labels
- module: convolution (Problems related to convolutions: THNN, THCUNN, CuDNN)
- module: cuda (Related to torch.cuda, and CUDA support in general)
- triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments


c-f-h commented May 6, 2025

πŸ› Describe the bug

The following script works fine if I switch the device to CPU or change the tensor dtypes to float32; with int8 tensors on CUDA it fails with the error below.

import torch

device = torch.device("cuda")         # works fine with "cpu"
print(f"Using device: {device}")

# works fine if both are float32
input  = torch.randint(low=0, high=2, size=(1, 1, 6, 6), dtype=torch.int8).to(device)
kernel = torch.randint(low=0, high=2, size=(1, 1, 3, 3), dtype=torch.int8).to(device)

output = torch.nn.functional.conv2d(input, kernel, padding=1)
print("Convolution successful. Output shape:", output.shape)

Traceback:

Using device: cuda
Traceback (most recent call last):
  File "C:\Users\Clemens\prog\cuda-conv-int8.py", line 10, in <module>
    output = torch.nn.functional.conv2d(input, kernel, padding=1)
RuntimeError: GET was unable to find an engine to execute this computation
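
For reference, the float32 path mentioned above does run on CUDA. A minimal sketch of that workaround (casting to float32 for the convolution and back to int8 afterwards, which is only exact as long as the accumulated sums are representable in float32 and fit into int8):

import torch

device = torch.device("cuda")

input  = torch.randint(low=0, high=2, size=(1, 1, 6, 6), dtype=torch.int8, device=device)
kernel = torch.randint(low=0, high=2, size=(1, 1, 3, 3), dtype=torch.int8, device=device)

# Run the convolution in float32 and cast back; exact for these small 0/1
# values, since every accumulated sum fits in both float32 and int8.
output = torch.nn.functional.conv2d(input.float(), kernel.float(), padding=1).to(torch.int8)
print("Output shape:", output.shape, "dtype:", output.dtype)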

Versions

PyTorch version: 2.7.0+cu128
Is debug build: False
CUDA used to build PyTorch: 12.8
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 Home (10.0.19045 64-bit)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.13.3 (tags/v3.13.3:6280bb5, Apr 8 2025, 14:47:33) [MSC v.1943 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19045-SP0
Is CUDA available: True
CUDA runtime version: 12.8.93
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1060 6GB
Nvidia driver version: 572.83
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Name: Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz
Manufacturer: GenuineIntel
Family: 1
Architecture: 9
ProcessorType: 3
DeviceID: CPU0
CurrentClockSpeed: 3201
MaxClockSpeed: 3201
L2CacheSize: 1024
L2CacheSpeed: None
Revision: 15363

Versions of relevant libraries:
[pip3] numpy==2.1.2
[pip3] torch==2.7.0+cu128
[pip3] torchaudio==2.7.0+cu128
[pip3] torchvision==0.22.0+cu128
[conda] Could not collect

cc @ptrblck @msaroufim @eqy @jerryzh168


Aidyn-A (Collaborator) commented May 7, 2025

@eqy, does cuDNN support the int8 dtype?


eqy (Collaborator) commented May 7, 2025

I'm not sure this is supported, though we should error out earlier than this point if that's the case.

In the meantime, @c-f-h, have you also tried going through the quantized conv2d op explicitly?

qconv = torch.ops.quantized.conv2d
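
For reference, a rough sketch of what calling that op could look like. The scale and zero_point values below are placeholders, and the eager-mode quantized conv kernels dispatch to the CPU backends (fbgemm/qnnpack), so this is not a verified path for the CUDA case above:

import torch

# Quantize the activation (quint8) and the weight (qint8); scale/zero_point
# values are placeholders chosen for illustration only.
x  = torch.randint(low=0, high=2, size=(1, 1, 6, 6)).float()
w  = torch.randint(low=0, high=2, size=(1, 1, 3, 3)).float()
qx = torch.quantize_per_tensor(x, scale=1.0, zero_point=0, dtype=torch.quint8)
qw = torch.quantize_per_tensor(w, scale=1.0, zero_point=0, dtype=torch.qint8)

# Prepack the weight (bias=None), then run the quantized conv with an
# assumed output scale/zero_point.
packed = torch.ops.quantized.conv2d_prepack(
    qw, None, stride=[1, 1], padding=[1, 1], dilation=[1, 1], groups=1)
qy = torch.ops.quantized.conv2d(qx, packed, 1.0, 0)

print(qy.shape, qy.dtype)   # quantized output tensor
print(qy.int_repr())        # underlying integer values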

janeyx99 added the module: cuda, module: convolution, and triaged labels on May 7, 2025