4: HW Exception by GPU node-4 (Agent handle: 0x61372ad74ae0) reason :GPU Hang
8: HW Exception by GPU node-4 (Agent handle: 0x56387a41ea30) reason :GPU Hang
shared/rocroller/test/catch/GlobalLoadStoreTest.cpp:105: FAILED:
{Unknown expression after the reported line}
due to a fatal error condition:
SIGABRT - Abort (abnormal termination) signal
1/2 Test #4: rocroller-tests_full_suite .........Subprocess aborted***Exception: 113.49 sec
2/2 Test #8: rocroller-tests-catch_full_suite ...Subprocess aborted***Exception: 113.50 sec
0% tests passed, 2 tests failed out of 2
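The two `HW Exception` lines in the excerpt above can be pulled out of a saved log with a short script, for example to confirm that both aborted processes reported the same GPU node. This is only a triage sketch: the function name `find_hw_exceptions` is hypothetical, the pattern is modeled on the two lines shown here rather than on any documented log format, and the leading `4:` / `8:` is assumed to be the ctest test number (it matches `Test #4` and `Test #8` in the ctest summary).

```python
import re

# Pattern modeled on the two log lines in this report (an assumption, not a
# documented format). The optional "N: " prefix is assumed to be the ctest
# test number that prefixes each line of test output.
HW_EXCEPTION = re.compile(
    r"^(?:(?P<test>\d+): )?HW Exception by GPU node-(?P<node>\d+) "
    r"\(Agent handle: (?P<handle>0x[0-9a-f]+)\) reason :(?P<reason>.+)$"
)


def find_hw_exceptions(log_text):
    """Return one dict per HW exception line found in log_text."""
    hits = []
    for line in log_text.splitlines():
        m = HW_EXCEPTION.match(line.strip())
        if m:
            hits.append({
                "test": int(m["test"]) if m["test"] else None,
                "node": int(m["node"]),
                "handle": m["handle"],
                "reason": m["reason"].strip(),
            })
    return hits


# The two lines from the log excerpt in this report.
sample = """\
4: HW Exception by GPU node-4 (Agent handle: 0x61372ad74ae0) reason :GPU Hang
8: HW Exception by GPU node-4 (Agent handle: 0x56387a41ea30) reason :GPU Hang
"""

hits = find_hw_exceptions(sample)
nodes = {h["node"] for h in hits}
print(len(hits), sorted(nodes))
```

Two hits with a single node in `nodes` is consistent with one hang on the shared GPU taking down both concurrent test processes, although the script alone cannot establish which process triggered it.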
Summary
Test rocroller (shard 4/5) on gfx94X-dcgpu failed in bump PR #5556 with a GPU hardware exception that simultaneously aborted both concurrent rocroller ctest processes. The rocroller test binary is unchanged; only rocm-systems was bumped (18 commits, b8c378e → 84ed18c).
Context
gfx94X-dcgpu (gfx942, MI300X)
linux-gfx942-1gpu-core42-ossci-rocm
WindowsRunners
Full logs:
Failure signature