CUDA - Incorrect code generated when -O1 or more is enabled #59156

@carlosgalvezp

Hi,

We noticed that the following minimal example produces an incorrect result when compiled with -O1 or higher:

#include <cstdio>

struct Member
{
    int value;
    bool valid{}; // <<<< Removing {} makes it work
};

class Wrapper
{
 public:
    __host__ __device__ Wrapper(){}
    __host__ __device__ explicit Wrapper(Member member) :
        member_(member)
    {}
    Member member_{};
};

__device__ inline Member makeMember(int const input)  //<<<<< Removing inline makes it work
{
    return Member{input, true};
}

__global__ void testKernel(int const input, Wrapper* const d_wrapper)
{
    auto const x = makeMember(input);
    *d_wrapper = Wrapper(x);
    printf("Wrapper member value: %d\n", d_wrapper->member_.value);
    printf("Member value: %d\n", x.value);
}

int main()
{
    Wrapper* d_wrapper{};
    cudaMalloc(&d_wrapper, sizeof(Wrapper));
    testKernel<<<1, 1>>>(123, d_wrapper);
    cudaDeviceSynchronize();
}

Compiled with:

clang --cuda-path=/usr/local/cuda-11.7 -O3 -std=c++14 --cuda-gpu-arch=sm_75 main.cu -L /usr/local/cuda-11.7/lib64/ -lcudart_static -ldl -lpthread -lrt

Expected result:

Wrapper member value: 123
Member value: 123

Actual result:

Wrapper member value: 0
Member value: 123

NVCC, on the other hand, produces the correct result. Is this a compiler bug, or is there a hidden bug in our code?
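For reference, the expected output follows from ordinary C++14 semantics. Below is a minimal host-only sketch of the same types with the CUDA qualifiers removed. The helpers `hostWrapperValue` and `selfCheck` are hypothetical names introduced only for this illustration, and the sketch assumes compilation with -std=c++14 or later, since `Member` is only an aggregate from C++14 onwards.

```cpp
#include <cassert>

// Same layout as in the report: `valid` carries a default member initializer.
struct Member
{
    int value;
    bool valid{};
};

// Same class as in the report, with the __host__ __device__ qualifiers removed.
class Wrapper
{
 public:
    Wrapper() {}
    explicit Wrapper(Member member) :
        member_(member)
    {}
    Member member_{};
};

inline Member makeMember(int const input)
{
    return Member{input, true};  // aggregate initialization (allowed since C++14)
}

// Hypothetical helper: performs the same assignment as testKernel, on the host.
int hostWrapperValue(int const input)
{
    Wrapper wrapper{};
    auto const x = makeMember(input);
    wrapper = Wrapper(x);
    return wrapper.member_.value;
}

// Hypothetical helper: checks the value the report expects to see.
void selfCheck()
{
    assert(hostWrapperValue(123) == 123);
}
```

Calling `selfCheck()` from a host `main` should pass, which is why we expect the device code to print 123 for the wrapper member as well.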

Thanks!

Labels: bug, cuda, llvm:instcombine
