[CUDA] illegal memory read on MatrixDiagPart #104363

@kokol16

Description

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

tf 2.20

Custom code

Yes

OS platform and distribution

Ubuntu 22.04

Mobile device

No response

Python version

3.10

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

CUDA 12.5.1, cuDNN 9.2.1

GPU model and memory

No response

Current behavior?

Compute-Sanitizer reports an out-of-bounds global read in MatrixDiagPartKernel.

Standalone code to reproduce the issue

# MatrixDiagPartOp

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')

assert gpus, 'No GPU found'

tf.config.experimental.set_memory_growth(gpus[0], True)

with tf.device("/GPU:0"):
    # values truncated in crash log; using uniform fill to match declared shape
    input = tf.ones([46341,46341], dtype=tf.float32)
    tf.raw_ops.MatrixDiagPart(input=input)
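
For context, MatrixDiagPart extracts the main diagonal of the innermost matrix, so the result here should be a length-46341 vector. A small NumPy equivalent of the expected behavior (for reference only; this is not the failing code path):

```python
import numpy as np

# A 3x3 matrix standing in for the 46341x46341 input from the repro.
a = np.arange(9, dtype=np.float32).reshape(3, 3)

# np.diagonal mirrors what tf.raw_ops.MatrixDiagPart returns for a 2-D input:
# the main diagonal as a 1-D vector.
diag = np.diagonal(a)
print(diag)  # [0. 4. 8.]
```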

Relevant log output

========= Invalid __global__ read of size 4 bytes
=========     at void tensorflow::functor::MatrixDiagPartKernel<float>(int, int, int, int, int, int, int, T1, bool, bool, const T1 *, T1 *)+0x690
=========     by thread (260,0,0) in block (45,0,0)
=========     Address 0x7f3f80004860 is out of bounds
=========     and is 8589916064 bytes before the nearest allocation at 0x7f4180000000 of size 17179869184 bytes
=========     Saved host backtrace up to driver entry point at kernel launch time
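
A plausible (unconfirmed) explanation for the out-of-bounds address: the kernel signature in the log takes `int` parameters, and 46341 is the smallest n whose n*n element count overflows a signed 32-bit integer, so a flat index computed in int32 would wrap negative. A minimal sketch of that arithmetic (the int32-overflow cause is an assumption on my part, not something the report confirms):

```python
# Assumption: the kernel computes flat element indices in 32-bit ints
# (the sanitizer log shows `int` parameters in its signature).
INT32_MAX = 2**31 - 1  # 2147483647

n = 46341
print(46340 * 46340)  # 2147395600 -- still representable as int32
print(n * n)          # 2147576281 -- exceeds INT32_MAX, wraps if stored in int32
```

So a matrix of 46340x46340 would be the largest square float32 input whose element count still fits in a signed 32-bit index.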
