Open
Labels: type:bug
Description
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
No
Source
source
TensorFlow version
tf 2.20
Custom code
Yes
OS platform and distribution
Ubuntu 22.04
Mobile device
No response
Python version
3.10
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
CUDA 12.5.1, cuDNN 9.2.1
GPU model and memory
No response
Current behavior?
compute-sanitizer reports an out-of-bounds global read in MatrixDiagPartKernel when the code below runs on a GPU.
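For context, MatrixDiagPart with default arguments returns the main diagonal of each innermost matrix. A small NumPy stand-in for the repro below (a tiny n instead of 46341, so no multi-GB allocation is needed) shows the expected result shape:

```python
import numpy as np

n = 4  # small stand-in for the 46341 x 46341 input in the repro
a = np.ones((n, n), dtype=np.float32)

# Equivalent of tf.raw_ops.MatrixDiagPart(input=a) with defaults:
# the main diagonal, shape [n]
diag = np.diagonal(a)
print(diag.shape)  # (4,)
```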
Standalone code to reproduce the issue
# MatrixDiagPartOp
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
assert gpus, 'No GPU found'
tf.config.experimental.set_memory_growth(gpus[0], True)

with tf.device("/GPU:0"):
    # values truncated in crash log; using uniform fill to match declared shape
    input = tf.ones([46341, 46341], dtype=tf.float32)
    tf.raw_ops.MatrixDiagPart(input=input)

Relevant log output
========= Invalid __global__ read of size 4 bytes
========= at void tensorflow::functor::MatrixDiagPartKernel<float>(int, int, int, int, int, int, int, T1, bool, bool, const T1 *, T1 *)+0x690
========= by thread (260,0,0) in block (45,0,0)
========= Address 0x7f3f80004860 is out of bounds
========= and is 8589916064 bytes before the nearest allocation at 0x7f4180000000 of size 17179869184 bytes
========= Saved host backtrace up to driver entry point at kernel launch time
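A plausible root cause (my assumption, not confirmed by the log): the kernel indexes the input with signed 32-bit ints, and a 46341 x 46341 matrix has 2,147,488,281 elements, just over INT32_MAX, so a flat element index can wrap negative. A wrapped index of about -2.15e9 floats corresponds to a read roughly 8.59 GB below the allocation, which is consistent with the "8589916064 bytes before the nearest allocation" figure above. A quick arithmetic check:

```python
# Check whether the input size overflows a signed 32-bit flat index
n = 46341
total = n * n                 # number of float32 elements in the input
INT32_MAX = 2**31 - 1

print(total)                  # 2147488281
print(total > INT32_MAX)      # True: exceeds INT32_MAX by 4634

# Value a C int32 would wrap to if a flat index reached `total`
wrapped = (total + 2**31) % 2**32 - 2**31
print(wrapped)                # -2147479015
```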