[CodeGenPrepare] Make sure instruction get from SunkAddrs is before MemoryInst #139303

weiguozhi · 2025-05-09T18:20:38Z

optimizeBlock may optimize the same block multiple times. In the first iteration of the loop, MemoryInst1 may generate a sunk address instruction and store it in SunkAddrs. In the second iteration of the loop, MemoryInst2 may use the same address and reuse the sunk instruction stored in SunkAddrs, but MemoryInst2 may come before both MemoryInst1 and the corresponding sunk instruction. To avoid a use-before-def error, we need to move the sunk instruction before MemoryInst2.

Fixes #138208.
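
For illustration only, the reordering described above can be modeled outside LLVM. The sketch below treats a basic block as a std::list of instruction names. Block, comesBefore and reuseSunkAddr are hypothetical stand-ins, not LLVM APIs; they only mimic the comesBefore/moveBefore pattern used in this patch.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Toy stand-in for a basic block: an ordered list of instruction names.
// This is a hypothetical model, not an LLVM class.
using Block = std::list<std::string>;

// Returns true if instruction A appears strictly before instruction B.
bool comesBefore(const Block &BB, const std::string &A, const std::string &B) {
  auto PosA = std::find(BB.begin(), BB.end(), A);
  auto PosB = std::find(BB.begin(), BB.end(), B);
  return std::distance(BB.begin(), PosA) < std::distance(BB.begin(), PosB);
}

// Reuse a cached address computation for MemInst. If the cached instruction
// sits below MemInst, move it to just before MemInst so that the definition
// precedes the use.
void reuseSunkAddr(Block &BB, const std::string &SunkAddr,
                   const std::string &MemInst) {
  auto From = std::find(BB.begin(), BB.end(), SunkAddr);
  auto To = std::find(BB.begin(), BB.end(), MemInst);
  assert(From != BB.end() && To != BB.end() && "both must be in the block");
  if (!comesBefore(BB, MemInst, SunkAddr))
    return; // Already defined before the use; nothing to do.
  BB.splice(To, BB, From); // Relink the node in place; no copy is made.
}
```

With a block containing mem1, sunkaddr, mem2 in that order, calling reuseSunkAddr(BB, "sunkaddr", "mem1") should leave the order sunkaddr, mem1, mem2.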

llvmbot · 2025-05-09T18:21:15Z

@llvm/pr-subscribers-llvm-transforms

Author: None (weiguozhi)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/139303.diff

2 Files Affected:

(modified) llvm/lib/CodeGen/CodeGenPrepare.cpp (+3)
(added) llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll (+44)

diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index f9dcb472ed1d2..9d491120dcb39 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -5913,6 +5913,9 @@ bool CodeGenPrepare::optimizeMemoryInst(Instruction *MemoryInst, Value *Addr,
   if (SunkAddr) {
     LLVM_DEBUG(dbgs() << "CGP: Reusing nonlocal addrmode: " << AddrMode
                       << " for " << *MemoryInst << "\n");
+    Instruction *AddrInst = dyn_cast<Instruction>(SunkAddr);
+    if (AddrInst && MemoryInst->comesBefore(AddrInst))
+      AddrInst->moveBefore(MemoryInst->getIterator());
     if (SunkAddr->getType() != Addr->getType()) {
       if (SunkAddr->getType()->getPointerAddressSpace() !=
               Addr->getType()->getPointerAddressSpace() &&
diff --git a/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll b/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll
new file mode 100644
index 0000000000000..019f311406550
--- /dev/null
+++ b/llvm/test/Transforms/CodeGenPrepare/X86/sink-addr-reuse.ll
@@ -0,0 +1,44 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -p 'require<profile-summary>,codegenprepare' -cgpp-huge-func=0 < %s | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-grtev4-linux-gnu"
+
+declare void @g(ptr)
+
+; %load and %load5 use the same address, %load5 is optimized first, %load is
+; optimized later and reuse the same address computation instruction. We must
+; make sure not to generate use before def error.
+
+define void @f(ptr %arg) {
+; CHECK-LABEL: define void @f(
+; CHECK-SAME: ptr [[ARG:%.*]]) {
+; CHECK-NEXT:  [[BB:.*:]]
+; CHECK-NEXT:    [[GETELEMENTPTR:%.*]] = getelementptr i8, ptr [[ARG]], i64 -64
+; CHECK-NEXT:    call void @g(ptr [[GETELEMENTPTR]])
+; CHECK-NEXT:    [[SUNKADDR1:%.*]] = getelementptr i8, ptr [[ARG]], i64 -64
+; CHECK-NEXT:    [[LOAD:%.*]] = load ptr, ptr [[SUNKADDR1]], align 8
+; CHECK-NEXT:    [[SUNKADDR:%.*]] = getelementptr i8, ptr [[ARG]], i64 -56
+; CHECK-NEXT:    [[LOAD4:%.*]] = load i32, ptr [[SUNKADDR]], align 8
+; CHECK-NEXT:    [[LOAD5:%.*]] = load ptr, ptr [[SUNKADDR1]], align 8
+; CHECK-NEXT:    [[TMP0:%.*]] = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 1, i32 0)
+; CHECK-NEXT:    [[MATH:%.*]] = extractvalue { i32, i1 } [[TMP0]], 0
+; CHECK-NEXT:    ret void
+;
+bb:
+  %getelementptr = getelementptr i8, ptr %arg, i64 -64
+  %getelementptr1 = getelementptr i8, ptr %arg, i64 -56
+  call void @g(ptr %getelementptr)
+  br label %bb3
+
+bb3:
+  %load = load ptr, ptr %getelementptr, align 8
+  %load4 = load i32, ptr %getelementptr1, align 8
+  %load5 = load ptr, ptr %getelementptr, align 8
+  %add = add i32 1, 0
+  %icmp = icmp eq i32 %add, 0
+  br i1 %icmp, label %bb7, label %bb7
+
+bb7:
+  ret void
+}

arsenm · 2025-05-12T12:18:48Z

llvm/lib/CodeGen/CodeGenPrepare.cpp

+    Instruction *AddrInst = dyn_cast<Instruction>(SunkAddr);
+    if (AddrInst && MemoryInst->comesBefore(AddrInst))


This is a bit brute force. Is there another way to get the implied correct insert point, e.g. changing the insertion point above instead of using MemoryInst?

weiguozhi · 2025-05-14

I added a function to find the appropriate insert position for a sunk address instruction.
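
The function added in the follow-up commit is not shown in this thread. As a rough sketch of one possible approach, assuming the insert position is taken to be the first user of the address in the block, a toy model could look like the following. Inst, firstUserIndex and insertSunkAddr are hypothetical names for illustration and do not reflect the actual CodeGenPrepare code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction: a name plus the names of its operands.
struct Inst {
  std::string Name;
  std::vector<std::string> Operands;
};

// Returns the index of the first instruction in BB that uses Addr, or
// BB.size() if there is none. Inserting the sunk address computation at
// this index places it before every user in the block, regardless of the
// order in which the users are visited.
std::size_t firstUserIndex(const std::vector<Inst> &BB,
                           const std::string &Addr) {
  for (std::size_t I = 0; I < BB.size(); ++I)
    for (const std::string &Op : BB[I].Operands)
      if (Op == Addr)
        return I;
  return BB.size();
}

// Inserts a new instruction named SunkName ahead of the first user of Addr.
// The new instruction's own operands are left empty in this toy model.
void insertSunkAddr(std::vector<Inst> &BB, const std::string &Addr,
                    const std::string &SunkName) {
  std::size_t Pos = firstUserIndex(BB, Addr);
  assert(Pos <= BB.size() && "insert position must be inside the block");
  BB.insert(BB.begin() + static_cast<std::ptrdiff_t>(Pos), Inst{SunkName, {}});
}
```

The point of choosing the position this way is that it no longer depends on which memory instruction happens to be optimized first.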

rnk · 2025-05-12T20:51:22Z

"In the second iteration of the loop, MemoryInst2 may use the same address"

This sounds like an ABA problem, like use-after-free. The use of ValueMap should prevent this, right? Is the problem really address reuse? I just want to make sure that we don't have stale entries in the ValueMap.

weiguozhi · 2025-05-12T21:36:05Z

It's not ABA, just AB. In the following sequence, both memory instructions use the same address:

mem1 addr
mem2 addr

The original implementation assumes mem1 is always handled before mem2. When a sunkaddr instruction is first inserted, it is placed before mem1 and stored in SunkAddrs. When mem2 is handled later, it can simply reuse the previously inserted sunkaddr from SunkAddrs:

sunkaddr =
mem1 sunkaddr
mem2 sunkaddr

But #138208 shows that mem2 can actually be handled before mem1, because the same BB is processed multiple times. After the first iteration over the BB, we get:

mem1 addr
sunkaddr =
mem2 sunkaddr

In the second iteration over the BB, it detects that mem1 can also be optimized, finds in SunkAddrs that a sunk instruction for addr is already available, and uses it directly. Unfortunately, that sunkaddr is below mem1:

mem1 sunkaddr
sunkaddr =
mem2 sunkaddr

So the problem is that the following assumption is incorrect. We need to find a more appropriate insert position for the new sunkaddr instruction.

  // Insert this computation right after this user.  Since our caller is
  // scanning from the top of the BB to the bottom, reuse of the expr are
  // guaranteed to happen later.
  IRBuilder<> Builder(MemoryInst);
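
The orderings listed above can be checked mechanically. The following is a minimal sketch, not LLVM's verifier: Inst and defsPrecedeUses are hypothetical names, and names that are never defined in the block (such as addr) are assumed to be live-in.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy instruction for a single basic block.
struct Inst {
  std::string Def;               // Name defined here, or "" if none.
  std::vector<std::string> Uses; // Names this instruction reads.
};

// Returns true if every name defined in the block is defined before its
// first use. Names never defined in the block are treated as live-in.
bool defsPrecedeUses(const std::vector<Inst> &BB) {
  std::set<std::string> Local;
  for (const Inst &I : BB)
    if (!I.Def.empty())
      Local.insert(I.Def);

  std::set<std::string> Seen;
  for (const Inst &I : BB) {
    for (const std::string &U : I.Uses)
      if (Local.count(U) && !Seen.count(U))
        return false; // Use of a local name before its definition.
    if (!I.Def.empty())
      Seen.insert(I.Def);
  }
  return true;
}
```

In this model, the ordering with sunkaddr defined first and the intermediate ordering (mem1 addr, then sunkaddr =, then mem2 sunkaddr) both pass, while the final ordering, where mem1 uses sunkaddr above its definition, fails the check.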

rnk

Right, got it, it's not stale data, it's that we don't actually always iterate forward.

Let's go ahead and prioritize landing this, since it's a correctness fix.

llvmbot added the llvm:transforms label May 9, 2025

alexfh requested review from arsenm and nikic May 12, 2025 11:52

arsenm reviewed May 12, 2025

Commit a750890: Find an appropriate insert position for a sunk address instruction instead of just before MemoryInst.

rnk approved these changes May 14, 2025

weiguozhi merged commit 59c6d70 into llvm:main May 15, 2025
10 of 11 checks passed
