[GVN] MemorySSA for GVN: embed the memory state in symbolic expressions #123218

Open
wants to merge 3 commits into base: main


antoniofrighetto
Contributor

While migrating towards MemorySSA, account for the memory state modeled by MemorySSA by hashing it when computing the symbolic expressions for memory operations. Likewise, when phi-translating while walking the CFG for PRE possibilities, check whether the value number of an operand can be refined with one of the values from the incoming edges of the MemoryPhi associated with the current phi.


Original patch: https://reviews.llvm.org/D115160.
Minor changes wrt the original version: comments were added, and the refactoring of createCallExpr is not included (it now also has to take the call-site attributes into account, so I'm not sure the refactor is worth it).
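
To make the first part of the description concrete, here is a minimal, self-contained sketch of hashing the memory state into a load's symbolic expression. It is a toy model rather than the GVN code: `ToyExpr`, `ToyValueTable` and `numberLoad` are hypothetical names, and the memory state is assumed to be already reduced to a value number, as `addMemoryStateToExp` does in the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

// Toy stand-in for GVN's Expression: an opcode plus operand value numbers.
struct ToyExpr {
  unsigned Opcode;
  std::vector<uint32_t> Operands;
  bool operator<(const ToyExpr &O) const {
    return std::tie(Opcode, Operands) < std::tie(O.Opcode, O.Operands);
  }
};

// Toy value table: structurally identical expressions share a value number.
class ToyValueTable {
  std::map<ToyExpr, uint32_t> Numbering;
  uint32_t Next = 1;

public:
  uint32_t lookupOrAdd(const ToyExpr &E) {
    auto R = Numbering.insert({E, Next});
    if (R.second)
      ++Next; // A new expression consumed the next free value number.
    return R.first->second;
  }

  // Number a load from PtrVN. The value number of the incoming memory state
  // is appended as an extra operand, so it takes part in the comparison.
  uint32_t numberLoad(uint32_t PtrVN, uint32_t MemStateVN) {
    assert(PtrVN != 0 && MemStateVN != 0 && "0 is reserved for 'no number'");
    ToyExpr E{/*Opcode=*/1u, {PtrVN}};
    E.Operands.push_back(MemStateVN);
    return lookupOrAdd(E);
  }
};
```

In this model, two loads from the same pointer under the same memory state get the same value number, while a different memory state produces a different one.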

@llvmbot
Member

llvmbot commented Jan 16, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Antonio Frighetto (antoniofrighetto)

Full diff: https://github.com/llvm/llvm-project/pull/123218.diff

2 Files Affected:

  • (modified) llvm/include/llvm/Transforms/Scalar/GVN.h (+8)
  • (modified) llvm/lib/Transforms/Scalar/GVN.cpp (+87-6)
diff --git a/llvm/include/llvm/Transforms/Scalar/GVN.h b/llvm/include/llvm/Transforms/Scalar/GVN.h
index c8be390799836e..dd47cc917370e2 100644
--- a/llvm/include/llvm/Transforms/Scalar/GVN.h
+++ b/llvm/include/llvm/Transforms/Scalar/GVN.h
@@ -172,6 +172,10 @@ class GVNPass : public PassInfoMixin<GVNPass> {
     // Value number to PHINode mapping. Used for phi-translate in scalarpre.
     DenseMap<uint32_t, PHINode *> NumberingPhi;
 
+    // Value number to BasicBlock mapping. Used for phi-translate across
+    // MemoryPhis.
+    DenseMap<uint32_t, BasicBlock *> NumberingBB;
+
     // Cache for phi-translate in scalarpre.
     using PhiTranslateMap =
         DenseMap<std::pair<uint32_t, const BasicBlock *>, uint32_t>;
@@ -179,6 +183,7 @@ class GVNPass : public PassInfoMixin<GVNPass> {
 
     AAResults *AA = nullptr;
     MemoryDependenceResults *MD = nullptr;
+    MemorySSA *MSSA = nullptr;
     DominatorTree *DT = nullptr;
 
     uint32_t nextValueNumber = 1;
@@ -189,12 +194,14 @@ class GVNPass : public PassInfoMixin<GVNPass> {
     Expression createExtractvalueExpr(ExtractValueInst *EI);
     Expression createGEPExpr(GetElementPtrInst *GEP);
     uint32_t lookupOrAddCall(CallInst *C);
+    uint32_t lookupOrAddLoadStore(Instruction *I);
     uint32_t phiTranslateImpl(const BasicBlock *BB, const BasicBlock *PhiBlock,
                               uint32_t Num, GVNPass &Gvn);
     bool areCallValsEqual(uint32_t Num, uint32_t NewNum, const BasicBlock *Pred,
                           const BasicBlock *PhiBlock, GVNPass &Gvn);
     std::pair<uint32_t, bool> assignExpNewValueNum(Expression &exp);
     bool areAllValsInBB(uint32_t num, const BasicBlock *BB, GVNPass &Gvn);
+    void addMemoryStateToExp(Instruction *I, Expression &E);
 
   public:
     ValueTable();
@@ -217,6 +224,7 @@ class GVNPass : public PassInfoMixin<GVNPass> {
     void setAliasAnalysis(AAResults *A) { AA = A; }
     AAResults *getAliasAnalysis() const { return AA; }
     void setMemDep(MemoryDependenceResults *M) { MD = M; }
+    void setMemorySSA(MemorySSA *M) { MSSA = M; }
     void setDomTree(DominatorTree *D) { DT = D; }
     uint32_t getNextUnusedValueNumber() { return nextValueNumber; }
     void verifyRemoved(const Value *) const;
diff --git a/llvm/lib/Transforms/Scalar/GVN.cpp b/llvm/lib/Transforms/Scalar/GVN.cpp
index 31af2d8a617b63..b0c01bc31c0a8f 100644
--- a/llvm/lib/Transforms/Scalar/GVN.cpp
+++ b/llvm/lib/Transforms/Scalar/GVN.cpp
@@ -476,6 +476,27 @@ void GVNPass::ValueTable::add(Value *V, uint32_t num) {
     NumberingPhi[num] = PN;
 }
 
+// Include the incoming memory state into the hash of the expression for the
+// given instruction. If the incoming memory state is:
+// * LiveOnEntry, add the value number of the entry block,
+// * a MemoryPhi, add the value number of the basic block corresponding to that
+// MemoryPhi,
+// * a MemoryDef, add the value number of the memory setting instruction.
+void GVNPass::ValueTable::addMemoryStateToExp(Instruction *I, Expression &E) {
+  assert(MSSA && "addMemoryStateToExp should not be called without MemorySSA");
+  assert(MSSA->getMemoryAccess(I) && "Instruction does not access memory");
+  MemoryAccess *MA = MSSA->getSkipSelfWalker()->getClobberingMemoryAccess(I);
+
+  uint32_t N = 0;
+  if (isa<MemoryPhi>(MA))
+    N = lookupOrAdd(MA->getBlock());
+  else if (MSSA->isLiveOnEntryDef(MA))
+    N = lookupOrAdd(&I->getFunction()->getEntryBlock());
+  else
+    N = lookupOrAdd(cast<MemoryDef>(MA)->getMemoryInst());
+  E.varargs.push_back(N);
+}
+
 uint32_t GVNPass::ValueTable::lookupOrAddCall(CallInst *C) {
   // FIXME: Currently the calls which may access the thread id may
   // be considered as not accessing the memory. But this is
@@ -596,10 +617,37 @@ uint32_t GVNPass::ValueTable::lookupOrAddCall(CallInst *C) {
     return v;
   }
 
+  if (MSSA && AA->onlyReadsMemory(C)) {
+    Expression exp = createExpr(C);
+    addMemoryStateToExp(C, exp);
+    uint32_t e = assignExpNewValueNum(exp).first;
+    valueNumbering[C] = e;
+    return e;
+  }
+
   valueNumbering[C] = nextValueNumber;
   return nextValueNumber++;
 }
 
+/// Returns the value number for the specified load or store instruction.
+uint32_t GVNPass::ValueTable::lookupOrAddLoadStore(Instruction *I) {
+  if (!MSSA) {
+    valueNumbering[I] = nextValueNumber;
+    return nextValueNumber++;
+  }
+
+  Expression E;
+  E.type = I->getType();
+  E.opcode = I->getOpcode();
+  for (Use &Op : I->operands())
+    E.varargs.push_back(lookupOrAdd(Op));
+  addMemoryStateToExp(I, E);
+
+  uint32_t N = assignExpNewValueNum(E).first;
+  valueNumbering[I] = N;
+  return N;
+}
+
 /// Returns true if a value number exists for the specified value.
 bool GVNPass::ValueTable::exists(Value *V) const {
   return valueNumbering.contains(V);
@@ -615,6 +663,8 @@ uint32_t GVNPass::ValueTable::lookupOrAdd(Value *V) {
   auto *I = dyn_cast<Instruction>(V);
   if (!I) {
     valueNumbering[V] = nextValueNumber;
+    if (MSSA && isa<BasicBlock>(V))
+      NumberingBB[nextValueNumber] = cast<BasicBlock>(V);
     return nextValueNumber++;
   }
 
@@ -674,6 +724,9 @@ uint32_t GVNPass::ValueTable::lookupOrAdd(Value *V) {
       valueNumbering[V] = nextValueNumber;
       NumberingPhi[nextValueNumber] = cast<PHINode>(V);
       return nextValueNumber++;
+    case Instruction::Load:
+    case Instruction::Store:
+      return lookupOrAddLoadStore(I);
     default:
       valueNumbering[V] = nextValueNumber;
       return nextValueNumber++;
@@ -711,6 +764,7 @@ void GVNPass::ValueTable::clear() {
   valueNumbering.clear();
   expressionNumbering.clear();
   NumberingPhi.clear();
+  NumberingBB.clear();
   PhiTranslateTable.clear();
   nextValueNumber = 1;
   Expressions.clear();
@@ -725,6 +779,8 @@ void GVNPass::ValueTable::erase(Value *V) {
   // If V is PHINode, V <--> value number is an one-to-one mapping.
   if (isa<PHINode>(V))
     NumberingPhi.erase(Num);
+  else if (isa<BasicBlock>(V))
+    NumberingBB.erase(Num);
 }
 
 /// verifyRemoved - Verify that the value is removed from all internal data
@@ -2294,15 +2350,39 @@ bool GVNPass::ValueTable::areCallValsEqual(uint32_t Num, uint32_t NewNum,
 uint32_t GVNPass::ValueTable::phiTranslateImpl(const BasicBlock *Pred,
                                                const BasicBlock *PhiBlock,
                                                uint32_t Num, GVNPass &Gvn) {
+  // See if we can refine the value number by looking at the PN incoming value
+  // for the given predecessor.
   if (PHINode *PN = NumberingPhi[Num]) {
-    for (unsigned i = 0; i != PN->getNumIncomingValues(); ++i) {
-      if (PN->getParent() == PhiBlock && PN->getIncomingBlock(i) == Pred)
-        if (uint32_t TransVal = lookup(PN->getIncomingValue(i), false))
-          return TransVal;
-    }
+    if (PN->getParent() == PhiBlock)
+      for (unsigned i = 0; i != PN->getNumIncomingValues(); ++i)
+        if (PN->getIncomingBlock(i) == Pred)
+          if (uint32_t TransVal = lookup(PN->getIncomingValue(i), false))
+            return TransVal;
     return Num;
   }
 
+  if (BasicBlock *BB = NumberingBB[Num]) {
+    assert(MSSA && "NumberingBB is non-empty only when using MemorySSA");
+    // Value numbers of basic blocks are used to represent memory state in
+    // load/store instructions and read-only function calls when said state is
+    // set by a MemoryPhi.
+    if (BB != PhiBlock)
+      return Num;
+    MemoryPhi *MPhi = MSSA->getMemoryAccess(BB);
+    for (unsigned i = 0, N = MPhi->getNumIncomingValues(); i != N; ++i) {
+      if (MPhi->getIncomingBlock(i) != Pred)
+        continue;
+      MemoryAccess *MA = MPhi->getIncomingValue(i);
+      if (auto *PredPhi = dyn_cast<MemoryPhi>(MA))
+        return lookupOrAdd(PredPhi->getBlock());
+      if (MSSA->isLiveOnEntryDef(MA))
+        return lookupOrAdd(&BB->getParent()->getEntryBlock());
+      return lookupOrAdd(cast<MemoryUseOrDef>(MA)->getMemoryInst());
+    }
+    llvm_unreachable(
+        "CFG/MemorySSA mismatch: predecessor not found among incoming blocks");
+  }
+
   // If there is any value related with Num is defined in a BB other than
   // PhiBlock, it cannot depend on a phi in PhiBlock without going through
   // a backedge. We can do an early exit in that case to save compile time.
@@ -2337,7 +2417,7 @@ uint32_t GVNPass::ValueTable::phiTranslateImpl(const BasicBlock *Pred,
   }
 
   if (uint32_t NewNum = expressionNumbering[Exp]) {
-    if (Exp.opcode == Instruction::Call && NewNum != Num)
+    if (!MSSA && Exp.opcode == Instruction::Call && NewNum != Num)
       return areCallValsEqual(Num, NewNum, Pred, PhiBlock, Gvn) ? NewNum : Num;
     return NewNum;
   }
@@ -2738,6 +2818,7 @@ bool GVNPass::runImpl(Function &F, AssumptionCache &RunAC, DominatorTree &RunDT,
   ICF = &ImplicitCFT;
   this->LI = &LI;
   VN.setMemDep(MD);
+  VN.setMemorySSA(MSSA);
   ORE = RunORE;
   InvalidBlockRPONumbers = true;
   MemorySSAUpdater Updater(MSSA);
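
The MemoryPhi branch added to phiTranslateImpl in the diff above can be modelled with a small standalone sketch. It is a simplification under stated assumptions: blocks and memory accesses are identified by strings, `ToyMemoryPhi` and `ToyTable` are hypothetical names, and the table stores the phi directly instead of going through `MSSA->getMemoryAccess(BB)`. The special handling of incoming MemoryPhis and LiveOnEntry is folded into "look up the name of the incoming access".

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy MemoryPhi: lives in a block and has one incoming memory state per
// predecessor, stored as (predecessor block, defining access) pairs.
struct ToyMemoryPhi {
  std::string Block;
  std::vector<std::pair<std::string, std::string>> Incoming;
};

class ToyTable {
  std::map<std::string, uint32_t> Numbers;       // entity name -> value number
  std::map<uint32_t, ToyMemoryPhi> NumberingBB;  // block VN -> its MemoryPhi
  uint32_t Next = 1;

public:
  uint32_t lookupOrAdd(const std::string &Name) {
    auto R = Numbers.insert({Name, Next});
    if (R.second)
      ++Next;
    return R.first->second;
  }

  // Number the block of a MemoryPhi and remember the phi for translation.
  uint32_t addPhiBlock(const ToyMemoryPhi &Phi) {
    uint32_t N = lookupOrAdd(Phi.Block);
    NumberingBB[N] = Phi;
    return N;
  }

  // Translate Num across the edge Pred -> PhiBlock. Only value numbers that
  // stand for the MemoryPhi of PhiBlock are refined; others are unchanged.
  uint32_t phiTranslate(const std::string &Pred, const std::string &PhiBlock,
                        uint32_t Num) {
    auto It = NumberingBB.find(Num);
    if (It == NumberingBB.end() || It->second.Block != PhiBlock)
      return Num;
    for (const auto &In : It->second.Incoming)
      if (In.first == Pred)
        return lookupOrAdd(In.second);
    assert(false && "predecessor not found among incoming blocks");
    return Num;
  }
};
```

With a phi in `merge` whose incoming states are `store.a` from `left` and `store.b` from `right`, translating the value number of `merge` over the `left` edge yields the number of `store.a`, which is what lets a load in `merge` be matched against a load in `left`.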

-        if (uint32_t TransVal = lookup(PN->getIncomingValue(i), false))
-          return TransVal;
-    }
+    if (PN->getParent() == PhiBlock)
Contributor Author

Just hoisted the check out of the loop; I can also drop this, as it is not strictly related to the patch.
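
For reference, a small standalone sketch of why the hoist preserves behavior: the parent-block comparison does not depend on the loop index, so testing it once before the loop returns the same result as testing it on every iteration. `ToyPhi`, `translateOriginal` and `translateHoisted` are hypothetical stand-ins, with blocks named by strings and 0 meaning "no value number", mirroring the `lookup(..., false)` convention.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Toy PHI node: parent block plus (predecessor, incoming value number) pairs.
struct ToyPhi {
  std::string Parent;
  std::vector<std::pair<std::string, uint32_t>> Incoming;
};

// Shape of the loop before the change: the parent check runs per iteration.
uint32_t translateOriginal(const ToyPhi &PN, const std::string &Pred,
                           const std::string &PhiBlock, uint32_t Num) {
  for (std::size_t I = 0; I != PN.Incoming.size(); ++I)
    if (PN.Parent == PhiBlock && PN.Incoming[I].first == Pred)
      if (uint32_t TransVal = PN.Incoming[I].second)
        return TransVal;
  return Num;
}

// Shape of the loop after the change: the invariant check is done once.
uint32_t translateHoisted(const ToyPhi &PN, const std::string &Pred,
                          const std::string &PhiBlock, uint32_t Num) {
  if (PN.Parent == PhiBlock)
    for (std::size_t I = 0; I != PN.Incoming.size(); ++I)
      if (PN.Incoming[I].first == Pred)
        if (uint32_t TransVal = PN.Incoming[I].second)
          return TransVal;
  return Num;
}
```

Both functions agree for a matching predecessor, for an incoming value without a number, and for a phi that lives in a different block.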

@@ -596,10 +617,37 @@ uint32_t GVNPass::ValueTable::lookupOrAddCall(CallInst *C) {
     return v;
   }
 
+  if (MSSA && AA->onlyReadsMemory(C)) {
Contributor Author

I think this (and the one below) should check isMemorySSAEnabled too, as MSSA and MemDep may both be enabled.

@@ -2337,7 +2417,7 @@ uint32_t GVNPass::ValueTable::phiTranslateImpl(const BasicBlock *Pred,
   }
 
   if (uint32_t NewNum = expressionNumbering[Exp]) {
-    if (Exp.opcode == Instruction::Call && NewNum != Num)
+    if (!MSSA && Exp.opcode == Instruction::Call && NewNum != Num)
Contributor Author

Not completely sure why we need this here.

Member

We can remove this change first if it doesn't break any existing regression test.

Contributor Author

Indeed should be able to drop this.

Member

@dtcxzyw left a comment

Can you check which tests are broken after turning on GVNEnableMemorySSA?

@@ -476,6 +476,27 @@ void GVNPass::ValueTable::add(Value *V, uint32_t num) {
     NumberingPhi[num] = PN;
 }
 
+// Include the incoming memory state into the hash of the expression for the
Member

Use /// for header comments.

@antoniofrighetto
Contributor Author

> Can you check which tests are broken after turning on GVNEnableMemorySSA?

Yes, thanks, I'll be taking a look at those as well as at the compile-time regression hopefully soon.

@antoniofrighetto
Contributor Author

> Can you check which tests are broken after turning on GVNEnableMemorySSA?

Were you referring to dtcxzyw/llvm-opt-benchmark#1995 or some other tests that now appear to be broken?

@dtcxzyw
Member

dtcxzyw commented Mar 5, 2025

> > Can you check which tests are broken after turning on GVNEnableMemorySSA?
>
> Were you referring to dtcxzyw/llvm-opt-benchmark#1995 or some other tests that now appear to be broken?

I mean the existing regression tests (with MemDep on). They should be helpful as we are planning to replace MemDep with MemSSA in the future.

@antoniofrighetto
Contributor Author

> > > Can you check which tests are broken after turning on GVNEnableMemorySSA?
> >
> > Were you referring to dtcxzyw/llvm-opt-benchmark#1995 or some other tests that now appear to be broken?
>
> I mean the existing regression tests (with MemDep on). They should be helpful as we are planning to replace MemDep with MemSSA in the future.

The transition is still incomplete. I think it would be desirable to look at them in batch before turning MemSSAEnabled on, once the whole series of patches has landed.

@antoniofrighetto force-pushed the perf/gvn-memssa-add-memorystate branch from bd045a3 to b4e2bbc on April 2, 2025 11:15
@antoniofrighetto
Contributor Author

Some bottlenecks during linking: https://llvm-compile-time-tracker.com/compare.php?from=d7afafdbc464e65c56a0a1d77bad426aa7538306&to=b4e2bbc42d1582b36b0d89ebdd10bd8113af8d6a&stat=instructions:u.

I have been profiling this, and the increase seems to stem from a higher lookupOrAdd hit count (up by ~7% in stage1-O3 for 7zip/UI/Common/SetProperties.cpp.o, invoked from the newly added lookupOrAddLoadStore).

  1. Even though lookupOrAdd is called recursively when including the memory state, this should arguably be just a lookup, as the memory access should already have been visited if it is a call/load/store (whereas a basic block would have to be added to the map). Not sure whether this can be done better than with a hash map.
  2. Attempts to favour capacity over size while resizing the vector in assignExpNewValueNum seem to give small improvements in stage-1 but regressions in stage-2: https://llvm-compile-time-tracker.com/compare.php?from=6cb2f6de9b3cf0e72b7d45c9fc149457b3462ca3&to=767cd78b3a9c54c17dd20e7e99a7158174af5924&stat=instructions:u (there are also regressions if we employ a DenseMap instead of resizing the vector).
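
To illustrate point 1, a toy sketch that counts insertions and hits in a value table. `CountingTable` is a hypothetical model keyed by strings, not the GVN `ValueTable`. It only shows that looking up an already-numbered memory access is a hash lookup with no insertion, while a basic block first seen through a memory state is inserted.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Toy value table that records how often lookupOrAdd inserts versus hits.
class CountingTable {
  std::unordered_map<std::string, uint32_t> Numbers;
  uint32_t Next = 1;

public:
  unsigned Inserts = 0;
  unsigned Hits = 0;

  uint32_t lookupOrAdd(const std::string &Key) {
    auto It = Numbers.find(Key);
    if (It != Numbers.end()) {
      ++Hits; // Already visited: a pure lookup.
      return It->second;
    }
    assert(Next != 0 && "value number overflow");
    ++Inserts; // First visit: the key gets a fresh value number.
    Numbers.emplace(Key, Next);
    return Next++;
  }
};
```

Numbering a store once and then referring to it again as the memory state of a later load costs one insertion and one hit; the extra work per load is therefore the hash lookup itself, which is consistent with the profile described above.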

To move this forward, I'd suggest temporarily making load/store handling subject to IsMSSAEnabled (https://llvm-compile-time-tracker.com/compare.php?from=d7afafdbc464e65c56a0a1d77bad426aa7538306&to=f924a3b5278b047488f1cd228342821171cc477a&stat=instructions:u), and revisiting this afterwards while turning off MemDep. Although visiting loads/stores now incurs some overhead, I'd expect this to level off as we switch to a computationally cheaper analysis. Any thoughts? @alinas, @nikic

@antoniofrighetto
Contributor Author

Compile-time stable (https://llvm-compile-time-tracker.com/compare.php?from=7ed4ff374bc659ab1478f58eb76c08b7c1a83961&to=aecf7cffc53e45508b81967ea2fc734ab40298f9&stat=instructions:u) wrt initial b4e2bbc. A threshold for limiting non-local load/store processing might be required in upcoming patches. Should be on track for now.

@antoniofrighetto
Contributor Author

Kind ping.

4 participants