fused_conv_bias_relu fails with channels-first (NCHW): MIOpen workspace size error #309

@mjkvaak-amd

Describe the Bug

ConvBias(ReLU) fails when the input tensor uses the channels-first (NCHW) layout. The error is:

MIOpen Error: 83cd7bbb1465:/rocm-libraries/projects/miopen/src/fusion.cpp:1224: The provided workspace size is less than required. Req:328859648 Given:0
MIOpen error: 3

Minimal Steps/Code to Reproduce the Bug

import sys
import torch

try:
    import fused_conv_bias_relu
except ImportError as e:
    print("SKIP: Apex fused ops not available:", e)
    sys.exit(0)

# Raw Apex API: forward([x, weight, bias], padding, stride) -> list[tensor]
def run(name, x, w, b, pad=1, stride=1):
    try:
        fused_conv_bias_relu.forward([x, w, b], pad, stride)
        fused_conv_bias_relu.forward_no_relu([x, w, b], pad, stride)
        print(f"  {name}: OK")
        return True
    except Exception as e:
        print(f"  {name}: FAIL — {e}")
        return False

def main():
    bs, c, hw = 2, 256, 7
    x_nchw = torch.rand(bs, c, hw, hw, dtype=torch.half).cuda()
    x_nhwc = x_nchw.to(memory_format=torch.channels_last)
    w = torch.rand(256, c, 3, 3, dtype=torch.half).cuda().to(memory_format=torch.channels_last)
    b = torch.rand(1, 256, 1, 1, dtype=torch.half).cuda().to(memory_format=torch.channels_last)

    print("ConvBiasReLU + ConvBias layout test (fp16, bs=2, 256x7x7):")
    run("channels_last (NHWC)", x_nhwc, w, b)
    run("channels_first (NCHW)", x_nchw, w, b)

if __name__ == "__main__":
    main()

Expected Behavior

Either channels-first (NCHW) input should work, or the op should check the layout up front and raise a clear error for the unsupported channels-first layout.
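
As a rough illustration of the second option, the layout check could look something like the sketch below. It is only a sketch under assumptions: it assumes channels-last (NHWC) is the layout the fused op supports, and the helper names (channels_last_strides, is_channels_last, check_channels_last) are hypothetical and not part of Apex. It works on plain shape/stride tuples, so it can be tried without a GPU.

```python
def channels_last_strides(shape):
    """Element strides of a dense 4-D tensor with NCHW shape stored as NHWC."""
    n, c, h, w = shape
    return (h * w * c, 1, w * c, c)


def is_channels_last(shape, strides):
    """True if every dimension of size > 1 has the NHWC stride.

    Dimensions of size 1 are ignored because their stride does not
    affect the memory layout.
    """
    if len(shape) != 4 or len(strides) != 4:
        return False
    expected = channels_last_strides(shape)
    return all(s == e for d, s, e in zip(shape, strides, expected) if d > 1)


def check_channels_last(name, shape, strides):
    """Hypothetical guard: fail early with a readable message."""
    if not is_channels_last(shape, strides):
        raise ValueError(
            f"{name}: expected channels_last (NHWC) layout, "
            f"got shape={tuple(shape)} strides={tuple(strides)}; "
            "convert with x.to(memory_format=torch.channels_last)"
        )


# Shapes from the reproducer: bs=2, c=256, 7x7 feature map.
shape = (2, 256, 7, 7)
nchw_strides = (256 * 7 * 7, 7 * 7, 7, 1)   # default contiguous layout
nhwc_strides = channels_last_strides(shape)  # channels_last layout

check_channels_last("x_nhwc", shape, nhwc_strides)  # passes silently
try:
    check_channels_last("x_nchw", shape, nchw_strides)
except ValueError as e:
    print("rejected:", e)
```

With real tensors the same guard could be called as check_channels_last("x", tuple(x.shape), x.stride()) before invoking fused_conv_bias_relu.forward. PyTorch also offers x.is_contiguous(memory_format=torch.channels_last) as a built-in check, which would likely be the simpler choice inside the extension's Python wrapper.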

Environment

OS:

PRETTY_NAME="Ubuntu 24.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.3 LTS (Noble Numbat)"
VERSION_CODENAME=noble

GPU:

====================================== Product Info ======================================
GPU[0]          : Card Series:          AMD Instinct MI355X
GPU[0]          : Card Model:           0x75a3
GPU[0]          : Card Vendor:          Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0]          : Card SKU:             N/A
GPU[0]          : Subsystem ID:         0x75a3
GPU[0]          : Device Rev:           0x00
GPU[0]          : Node ID:              3
GPU[0]          : GUID:                 42583
GPU[0]          : GFX Version:          gfx950
(GPU[1-7] are the same)

ROCm: 7.1.1

Python: 3.12.3

Apex: 5921107
