Thanks to visit codestin.com
Credit goes to github.com

Skip to content

[VPlan] Manage noalias/alias_scope metadata in VPlan. (NFC) #136450

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Apr 19, 2025

Use VPIRMetadata added in #135272
to also manage no-alias metadata added by versioning.

Note that this means we have to build the no-alias metadata up-front
once. If it is not used, it will be discarded automatically.

Compile-time impact is neutral:
https://llvm-compile-time-tracker.com/compare.php?from=38bf1af41c5425a552a53feb13c71d82873f1c18&to=2fd7844cfdf5ec0f1c2ce0b9b3ae0763245b6922&stat=instructions:u

@llvmbot
Copy link
Member

llvmbot commented Apr 19, 2025

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

Use VPIRMetadata added in #135272
to also manage no-alias metadata added by versioning.

Note that this means we have to build the no-alias metadata up-front
once. If it is not used, it will be discarded automatically.


Patch is 53.26 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/136450.diff

10 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h (+2-1)
  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+56-37)
  • (modified) llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h (+12-2)
  • (modified) llvm/lib/Transforms/Vectorize/VPlan.cpp (+2-21)
  • (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+83-50)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanHelpers.h (-22)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+22-20)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+24-17)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/store-costs-sve.ll (+6-11)
  • (modified) llvm/unittests/Transforms/Vectorize/VPlanTest.cpp (+13-11)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h b/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
index f639f0adb9c43..d46c14f53b24f 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
@@ -36,6 +36,7 @@ class LoopVectorizationLegality;
 class LoopVectorizationCostModel;
 class PredicatedScalarEvolution;
 class LoopVectorizeHints;
+class LoopVersioning;
 class OptimizationRemarkEmitter;
 class TargetTransformInfo;
 class TargetLibraryInfo;
@@ -515,7 +516,7 @@ class LoopVectorizationPlanner {
   /// returned VPlan is valid for. If no VPlan can be built for the input range,
   /// set the largest included VF to the maximum VF for which no plan could be
   /// built.
-  VPlanPtr tryToBuildVPlanWithVPRecipes(VFRange &Range);
+  VPlanPtr tryToBuildVPlanWithVPRecipes(VFRange &Range, LoopVersioning *LVer);
 
   /// Build VPlans for power-of-2 VF's between \p MinVF and \p MaxVF inclusive,
   /// according to the information gathered by Legal when it checked if it is
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 7a5f618d09e95..f965d53f91bfb 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -2371,7 +2371,7 @@ void InnerLoopVectorizer::scalarizeInstruction(const Instruction *Instr,
       InputLane = VPLane::getFirstLane();
     Cloned->setOperand(I.index(), State.get(Operand, InputLane));
   }
-  State.addNewMetadata(Cloned, Instr);
+  RepRecipe->applyMetadata(Cloned);
 
   // Place the cloned scalar in the new loop.
   State.Builder.Insert(Cloned);
@@ -7989,24 +7989,6 @@ DenseMap<const SCEV *, Value *> LoopVectorizationPlanner::executePlan(
   if (VectorizingEpilogue)
     VPlanTransforms::removeDeadRecipes(BestVPlan);
 
-  // Only use noalias metadata when using memory checks guaranteeing no overlap
-  // across all iterations.
-  const LoopAccessInfo *LAI = ILV.Legal->getLAI();
-  std::unique_ptr<LoopVersioning> LVer = nullptr;
-  if (LAI && !LAI->getRuntimePointerChecking()->getChecks().empty() &&
-      !LAI->getRuntimePointerChecking()->getDiffChecks()) {
-
-    //  We currently don't use LoopVersioning for the actual loop cloning but we
-    //  still use it to add the noalias metadata.
-    //  TODO: Find a better way to re-use LoopVersioning functionality to add
-    //        metadata.
-    LVer = std::make_unique<LoopVersioning>(
-        *LAI, LAI->getRuntimePointerChecking()->getChecks(), OrigLoop, LI, DT,
-        PSE.getSE());
-    State.LVer = &*LVer;
-    State.LVer->prepareNoAliasMetadata();
-  }
-
   ILV.printDebugTracesAtStart();
 
   //===------------------------------------------------===//
@@ -8597,13 +8579,14 @@ VPRecipeBuilder::tryToWidenMemory(Instruction *I, ArrayRef<VPValue *> Operands,
     Builder.insert(VectorPtr);
     Ptr = VectorPtr;
   }
+  auto Metadata = getMetadataToPropagate(I);
   if (LoadInst *Load = dyn_cast<LoadInst>(I))
     return new VPWidenLoadRecipe(*Load, Ptr, Mask, Consecutive, Reverse,
-                                 I->getDebugLoc());
+                                 Metadata, I->getDebugLoc());
 
   StoreInst *Store = cast<StoreInst>(I);
   return new VPWidenStoreRecipe(*Store, Ptr, Operands[0], Mask, Consecutive,
-                                Reverse, I->getDebugLoc());
+                                Reverse, Metadata, I->getDebugLoc());
 }
 
 /// Creates a VPWidenIntOrFpInductionRecpipe for \p Phi. If needed, it will also
@@ -8745,6 +8728,7 @@ VPSingleDefRecipe *VPRecipeBuilder::tryToWidenCall(CallInst *CI,
                 Range);
   if (ShouldUseVectorIntrinsic)
     return new VPWidenIntrinsicRecipe(*CI, ID, Ops, CI->getType(),
+                                      getMetadataToPropagate(CI),
                                       CI->getDebugLoc());
 
   Function *Variant = nullptr;
@@ -8798,7 +8782,8 @@ VPSingleDefRecipe *VPRecipeBuilder::tryToWidenCall(CallInst *CI,
     }
 
     Ops.push_back(Operands.back());
-    return new VPWidenCallRecipe(CI, Variant, Ops, CI->getDebugLoc());
+    return new VPWidenCallRecipe(CI, Variant, Ops, getMetadataToPropagate(CI),
+                                 CI->getDebugLoc());
   }
 
   return nullptr;
@@ -8836,7 +8821,8 @@ VPWidenRecipe *VPRecipeBuilder::tryToWiden(Instruction *I,
           Plan.getOrAddLiveIn(ConstantInt::get(I->getType(), 1u, false));
       auto *SafeRHS = Builder.createSelect(Mask, Ops[1], One, I->getDebugLoc());
       Ops[1] = SafeRHS;
-      return new VPWidenRecipe(*I, make_range(Ops.begin(), Ops.end()));
+      return new VPWidenRecipe(*I, make_range(Ops.begin(), Ops.end()),
+                               getMetadataToPropagate(I));
     }
     [[fallthrough]];
   }
@@ -8882,7 +8868,8 @@ VPWidenRecipe *VPRecipeBuilder::tryToWiden(Instruction *I,
       // For other binops, the legacy cost model only checks the second operand.
       NewOps[1] = GetConstantViaSCEV(NewOps[1]);
     }
-    return new VPWidenRecipe(*I, make_range(NewOps.begin(), NewOps.end()));
+    return new VPWidenRecipe(*I, make_range(NewOps.begin(), NewOps.end()),
+                             getMetadataToPropagate(I));
   }
   case Instruction::ExtractValue: {
     SmallVector<VPValue *> NewOps(Operands);
@@ -8891,7 +8878,8 @@ VPWidenRecipe *VPRecipeBuilder::tryToWiden(Instruction *I,
     assert(EVI->getNumIndices() == 1 && "Expected one extractvalue index");
     unsigned Idx = EVI->getIndices()[0];
     NewOps.push_back(Plan.getOrAddLiveIn(ConstantInt::get(I32Ty, Idx, false)));
-    return new VPWidenRecipe(*I, make_range(NewOps.begin(), NewOps.end()));
+    return new VPWidenRecipe(*I, make_range(NewOps.begin(), NewOps.end()),
+                             getMetadataToPropagate(I));
   }
   };
 }
@@ -8978,8 +8966,9 @@ VPRecipeBuilder::handleReplication(Instruction *I, ArrayRef<VPValue *> Operands,
   assert((Range.Start.isScalar() || !IsUniform || !IsPredicated ||
           (Range.Start.isScalable() && isa<IntrinsicInst>(I))) &&
          "Should not predicate a uniform recipe");
-  auto *Recipe = new VPReplicateRecipe(
-      I, make_range(Operands.begin(), Operands.end()), IsUniform, BlockInMask);
+  auto *Recipe =
+      new VPReplicateRecipe(I, make_range(Operands.begin(), Operands.end()),
+                            IsUniform, BlockInMask, getMetadataToPropagate(I));
   return Recipe;
 }
 
@@ -9096,6 +9085,20 @@ bool VPRecipeBuilder::getScaledReductions(
   return false;
 }
 
+SmallVector<std::pair<unsigned, MDNode *>>
+VPRecipeBuilder::getMetadataToPropagate(Instruction *I) const {
+  SmallVector<std::pair<unsigned, MDNode *>> Metadata;
+  ::getMetadataToPropagate(I, Metadata);
+  if (LVer && isa<LoadInst, StoreInst>(I)) {
+    const auto &[AliasScopeMD, NoAliasMD] = LVer->getNoAliasMetadataFor(I);
+    if (AliasScopeMD)
+      Metadata.emplace_back(LLVMContext::MD_alias_scope, AliasScopeMD);
+    if (NoAliasMD)
+      Metadata.emplace_back(LLVMContext::MD_noalias, NoAliasMD);
+  }
+  return Metadata;
+}
+
 VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(
     Instruction *Instr, ArrayRef<VPValue *> Operands, VFRange &Range) {
   // First, check for specific widening recipes that deal with inductions, Phi
@@ -9168,13 +9171,14 @@ VPRecipeBase *VPRecipeBuilder::tryToCreateWidenRecipe(
                                 make_range(Operands.begin(), Operands.end()));
 
   if (auto *SI = dyn_cast<SelectInst>(Instr)) {
-    return new VPWidenSelectRecipe(
-        *SI, make_range(Operands.begin(), Operands.end()));
+    return new VPWidenSelectRecipe(*SI,
+                                   make_range(Operands.begin(), Operands.end()),
+                                   getMetadataToPropagate(SI));
   }
 
   if (auto *CI = dyn_cast<CastInst>(Instr)) {
     return new VPWidenCastRecipe(CI->getOpcode(), Operands[0], CI->getType(),
-                                 *CI);
+                                 *CI, getMetadataToPropagate(CI));
   }
 
   return tryToWiden(Instr, Operands);
@@ -9200,7 +9204,8 @@ VPRecipeBuilder::tryToCreatePartialReduction(Instruction *Reduction,
     SmallVector<VPValue *, 2> Ops;
     Ops.push_back(Plan.getOrAddLiveIn(Zero));
     Ops.push_back(BinOp);
-    BinOp = new VPWidenRecipe(*Reduction, make_range(Ops.begin(), Ops.end()));
+    BinOp = new VPWidenRecipe(*Reduction, make_range(Ops.begin(), Ops.end()),
+                              getMetadataToPropagate(Reduction));
     Builder.insert(BinOp->getDefiningRecipe());
     ReductionOpcode = Instruction::Add;
   }
@@ -9223,10 +9228,22 @@ void LoopVectorizationPlanner::buildVPlansWithVPRecipes(ElementCount MinVF,
                                                         ElementCount MaxVF) {
   assert(OrigLoop->isInnermost() && "Inner loop expected.");
 
+  // Only use noalias metadata when using memory checks guaranteeing no overlap
+  // across all iterations.
+  const LoopAccessInfo *LAI = Legal->getLAI();
+  std::unique_ptr<LoopVersioning> LVer = nullptr;
+  if (LAI && !LAI->getRuntimePointerChecking()->getChecks().empty() &&
+      !LAI->getRuntimePointerChecking()->getDiffChecks()) {
+    LVer = std::make_unique<LoopVersioning>(
+        *LAI, LAI->getRuntimePointerChecking()->getChecks(), OrigLoop, LI, DT,
+        PSE.getSE());
+    LVer->prepareNoAliasMetadata();
+  }
+
   auto MaxVFTimes2 = MaxVF * 2;
   for (ElementCount VF = MinVF; ElementCount::isKnownLT(VF, MaxVFTimes2);) {
     VFRange SubRange = {VF, MaxVFTimes2};
-    if (auto Plan = tryToBuildVPlanWithVPRecipes(SubRange)) {
+    if (auto Plan = tryToBuildVPlanWithVPRecipes(SubRange, LVer.get())) {
       bool HasScalarVF = Plan->hasScalarVFOnly();
       // Now optimize the initial VPlan.
       if (!HasScalarVF)
@@ -9534,7 +9551,8 @@ static void addExitUsersForFirstOrderRecurrences(
 }
 
 VPlanPtr
-LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
+LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range,
+                                                       LoopVersioning *LVer) {
 
   using namespace llvm::VPlanPatternMatch;
   SmallPtrSet<const InterleaveGroup<Instruction> *, 1> InterleaveGroups;
@@ -9580,7 +9598,7 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
   addCanonicalIVRecipes(*Plan, Legal->getWidestInductionType(), HasNUW, DL);
 
   VPRecipeBuilder RecipeBuilder(*Plan, OrigLoop, TLI, &TTI, Legal, CM, PSE,
-                                Builder);
+                                Builder, LVer);
 
   // ---------------------------------------------------------------------------
   // Pre-construction: record ingredients whose recipes we'll need to further
@@ -9694,8 +9712,9 @@ LoopVectorizationPlanner::tryToBuildVPlanWithVPRecipes(VFRange &Range) {
           Legal->isInvariantAddressOfReduction(SI->getPointerOperand())) {
         // Only create recipe for the final invariant store of the reduction.
         if (Legal->isInvariantStoreOfReduction(SI)) {
-          auto *Recipe =
-              new VPReplicateRecipe(SI, R.operands(), true /* IsUniform */);
+          auto *Recipe = new VPReplicateRecipe(
+              SI, R.operands(), true /* IsUniform */, /*Mask*/ nullptr,
+              RecipeBuilder.getMetadataToPropagate(SI));
           Recipe->insertBefore(*MiddleVPBB, MBIP);
         }
         R.eraseFromParent();
@@ -9881,7 +9900,7 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlan(VFRange &Range) {
   // Collect mapping of IR header phis to header phi recipes, to be used in
   // addScalarResumePhis.
   VPRecipeBuilder RecipeBuilder(*Plan, OrigLoop, TLI, &TTI, Legal, CM, PSE,
-                                Builder);
+                                Builder, nullptr);
   for (auto &R : Plan->getVectorLoopRegion()->getEntryBasicBlock()->phis()) {
     if (isa<VPCanonicalIVPHIRecipe>(&R))
       continue;
diff --git a/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h b/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
index fd0064a34c4c9..357a68972cb43 100644
--- a/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
+++ b/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
@@ -90,6 +90,10 @@ class VPRecipeBuilder {
   /// A mapping of partial reduction exit instructions to their scaling factor.
   DenseMap<const Instruction *, unsigned> ScaledReductionMap;
 
+  /// Loop versioning instance for getting noalias metadata guaranteed by
+  /// runtime checks.
+  LoopVersioning *LVer;
+
   /// Check if \p I can be widened at the start of \p Range and possibly
   /// decrease the range such that the returned value holds for the entire \p
   /// Range. The function should not be called for memory instructions or calls.
@@ -155,9 +159,10 @@ class VPRecipeBuilder {
                   const TargetTransformInfo *TTI,
                   LoopVectorizationLegality *Legal,
                   LoopVectorizationCostModel &CM,
-                  PredicatedScalarEvolution &PSE, VPBuilder &Builder)
+                  PredicatedScalarEvolution &PSE, VPBuilder &Builder,
+                  LoopVersioning *LVer)
       : Plan(Plan), OrigLoop(OrigLoop), TLI(TLI), TTI(TTI), Legal(Legal),
-        CM(CM), PSE(PSE), Builder(Builder) {}
+        CM(CM), PSE(PSE), Builder(Builder), LVer(LVer) {}
 
   std::optional<unsigned> getScalingForReduction(const Instruction *ExitInst) {
     auto It = ScaledReductionMap.find(ExitInst);
@@ -233,6 +238,11 @@ class VPRecipeBuilder {
     }
     return Plan.getOrAddLiveIn(V);
   }
+
+  /// Returns the metatadata that can be preserved from the original instruction
+  /// \p I, including noalias metadata guaranteed by runtime checks.
+  SmallVector<std::pair<unsigned, MDNode *>>
+  getMetadataToPropagate(Instruction *I) const;
 };
 } // end namespace llvm
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index fa2d95a44609a..7ca5008c607cf 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -220,8 +220,8 @@ VPTransformState::VPTransformState(const TargetTransformInfo *TTI,
                                    InnerLoopVectorizer *ILV, VPlan *Plan,
                                    Loop *CurrentParentLoop, Type *CanonicalIVTy)
     : TTI(TTI), VF(VF), CFG(DT), LI(LI), Builder(Builder), ILV(ILV), Plan(Plan),
-      CurrentParentLoop(CurrentParentLoop), LVer(nullptr),
-      TypeAnalysis(CanonicalIVTy), VPDT(*Plan) {}
+      CurrentParentLoop(CurrentParentLoop), TypeAnalysis(CanonicalIVTy),
+      VPDT(*Plan) {}
 
 Value *VPTransformState::get(const VPValue *Def, const VPLane &Lane) {
   if (Def->isLiveIn())
@@ -355,25 +355,6 @@ BasicBlock *VPTransformState::CFGState::getPreheaderBBFor(VPRecipeBase *R) {
   return VPBB2IRBB[LoopRegion->getPreheaderVPBB()];
 }
 
-void VPTransformState::addNewMetadata(Instruction *To,
-                                      const Instruction *Orig) {
-  // If the loop was versioned with memchecks, add the corresponding no-alias
-  // metadata.
-  if (LVer && isa<LoadInst, StoreInst>(Orig))
-    LVer->annotateInstWithNoAlias(To, Orig);
-}
-
-void VPTransformState::addMetadata(Value *To, Instruction *From) {
-  // No source instruction to transfer metadata from?
-  if (!From)
-    return;
-
-  if (Instruction *ToI = dyn_cast<Instruction>(To)) {
-    propagateMetadata(ToI, From);
-    addNewMetadata(ToI, From);
-  }
-}
-
 void VPTransformState::setDebugLocFrom(DebugLoc DL) {
   const DILocation *DIL = DL;
   // When a FSDiscriminator is enabled, we don't need to add the multiply
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 7084676af6d5b..585fa3cd12ef6 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1190,28 +1190,49 @@ struct VPIRPhi : public VPIRInstruction {
 #endif
 };
 
+using MDArrayRef = ArrayRef<std::pair<unsigned, MDNode *>>;
+
+/// Helper to manage IR metadata for recipes. It filters out metadata that
+/// cannot be proagated.
+class VPIRMetadata {
+  SmallVector<std::pair<unsigned, MDNode *>> Metadata;
+
+protected:
+  VPIRMetadata(MDArrayRef Metadata) : Metadata(Metadata) {}
+
+public:
+  /// Add all metadata to \p V if it is an instruction.
+  void applyMetadata(Value *V) const;
+
+  /// Return the IR metadata.
+  MDArrayRef getMetadata() const { return Metadata; }
+};
+
 /// VPWidenRecipe is a recipe for producing a widened instruction using the
 /// opcode and operands of the recipe. This recipe covers most of the
 /// traditional vectorization cases where each recipe transforms into a
 /// vectorized version of itself.
-class VPWidenRecipe : public VPRecipeWithIRFlags {
+class VPWidenRecipe : public VPRecipeWithIRFlags, public VPIRMetadata {
   unsigned Opcode;
 
 protected:
   template <typename IterT>
   VPWidenRecipe(unsigned VPDefOpcode, Instruction &I,
-                iterator_range<IterT> Operands)
-      : VPRecipeWithIRFlags(VPDefOpcode, Operands, I), Opcode(I.getOpcode()) {}
+                iterator_range<IterT> Operands, MDArrayRef Metadata)
+      : VPRecipeWithIRFlags(VPDefOpcode, Operands, I), VPIRMetadata(Metadata),
+        Opcode(I.getOpcode()) {}
 
 public:
   template <typename IterT>
-  VPWidenRecipe(Instruction &I, iterator_range<IterT> Operands)
-      : VPWidenRecipe(VPDef::VPWidenSC, I, Operands) {}
+  VPWidenRecipe(Instruction &I, iterator_range<IterT> Operands,
+                MDArrayRef Metadata)
+      : VPWidenRecipe(VPDef::VPWidenSC, I, Operands, Metadata) {}
 
   ~VPWidenRecipe() override = default;
 
   VPWidenRecipe *clone() override {
-    auto *R = new VPWidenRecipe(*getUnderlyingInstr(), operands());
+    auto *R =
+        new VPWidenRecipe(*getUnderlyingInstr(), operands(), getMetadata());
     R->transferFlags(*this);
     return R;
   }
@@ -1236,7 +1257,7 @@ class VPWidenRecipe : public VPRecipeWithIRFlags {
 };
 
 /// VPWidenCastRecipe is a recipe to create vector cast instructions.
-class VPWidenCastRecipe : public VPRecipeWithIRFlags {
+class VPWidenCastRecipe : public VPRecipeWithIRFlags, public VPIRMetadata {
   /// Cast instruction opcode.
   Instruction::CastOps Opcode;
 
@@ -1245,23 +1266,23 @@ class VPWidenCastRecipe : public VPRecipeWithIRFlags {
 
 public:
   VPWidenCastRecipe(Instruction::CastOps Opcode, VPValue *Op, Type *ResultTy,
-                    CastInst &UI)
-      : VPRecipeWithIRFlags(VPDef::VPWidenCastSC, Op, UI), Opcode(Opcode),
-        ResultTy(ResultTy) {
+                    CastInst &UI, MDArrayRef Metadata)
+      : VPRecipeWithIRFlags(VPDef::VPWidenCastSC, Op, UI),
+        VPIRMetadata(Metadata), Opcode(Opcode), ResultTy(ResultTy) {
     assert(UI.getOpcode() == Opcode &&
            "opcode of underlying cast doesn't match");
   }
 
   VPWidenCastRecipe(Instruction::CastOps Opcode, VPValue *Op, Type *ResultTy)
-      : VPRecipeWithIRFlags(VPDef::VPWidenCastSC, Op), Opcode(Opcode),
-        ResultTy(ResultTy) {}
+      : VPRecipeWithIRFlags(VPDef::VPWidenCastSC, Op), VPIRMetadata({}),
+        Opcode(Opcode), ResultTy(ResultTy) {}
 
   ~VPWidenCastRecipe() override = default;
 
   VPWidenCastRecipe *clone() override {
     if (auto *UV = getUnderlyingValue())
       return new VPWidenCastRecipe(Opcode, getOperand(0), ResultTy,
-                                   *cast<CastInst>(UV));
+                                   *cast<CastInst>(UV), getMetadata());
 
     return new VPWidenCastRecipe(Opcode, getOperand(0), ResultTy);
   }
@@ -1288,7 +1309,7 @@ class VPWidenCastRecipe : public VPRecipeWithIRFlags {
 };
 
 /// A recipe for widening vector intrinsics.
-class VPWidenIntrinsicRecipe : public VPRecipeWithIRFlags {
+class VPWidenIntrinsicRecipe : public VPRecipeWithIRFlags, public VPIRMetadata {
   /// ID of the vector intrinsic to widen.
   Intrinsic::ID VectorIntrinsicID;
 
@@ -1307,18 +1328,19 @@ class VPWidenIntrinsicRecipe : public VPRecipeWithIRFlags {
 public:
   VPWidenIntrinsicRecipe(CallInst &CI, Intrinsic::ID VectorIntrinsicID,
                          ArrayRef<VPValue *> CallArguments, Type *Ty,
-                         DebugLoc DL = {})
+                         MDArrayRef Metadata, DebugLoc DL = {})
       : VPRecipeWithIRFlags(VPDef::VPWidenIntrinsicSC, CallArguments, CI),
-        VectorIntrinsicID(VectorIntrinsicID), ResultTy(Ty),
-        MayReadFromMemory(CI.mayReadFromMemory()),
+        VPIRMetadata(Metadata), VectorIntrinsicID(VectorIntrinsicID),
+        ResultTy(Ty), MayRe...
[truncated]

Copy link
Contributor Author

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated now that #135272 has landed.

Main update includes updating VPWidenMemoryRecipes/VPReplicateRecipe to take metadata explicitly as arguments, as it may be augmented with extra metadata.

@ayalz
Copy link
Collaborator

ayalz commented Apr 25, 2025

Could the no-alias metadata added by loop versioning be managed by a dedicated VPlan-to-VPlan pass, continuing to capture only metadata that propagates from underlying instructions during initial VPlan/recipe construction? May require introducing API to update recipe metadata, but hopefully simple as it merely augments additional nodes?

@fhahn
Copy link
Contributor Author

fhahn commented Apr 25, 2025

Could the no-alias metadata added by loop versioning be managed by a dedicated VPlan-to-VPlan pass, continuing to capture only metadata that propagates from underlying instructions during initial VPlan/recipe construction? May require introducing API to update recipe metadata, but hopefully simple as it merely augments additional nodes?

We could, but we at the moment this would still mean we will be tied to the underlying instruction, which is then used to look up the metadata, which is why added it on construction of the recipes.

We could add VPIRMetadata to VPInstruction, and add the metadata before creating widen recipes (at that point we must have the underlying instruction available, although it might warrant even more a separate class for such VPInstructions at the early stage, raises the question what it should be called and its relationship with VPIRInstruction).

Or change the lookup of the metadata to be based on SCEV, which would require building SCEVs from various VP recipes. Both may also be good to do as follow-ups

Copy link
Collaborator

@ayalz ayalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the no-alias metadata added by loop versioning be managed by a dedicated VPlan-to-VPlan pass, continuing to capture only metadata that propagates from underlying instructions during initial VPlan/recipe construction? May require introducing API to update recipe metadata, but hopefully simple as it merely augments additional nodes?

We could, but we at the moment this would still mean we will be tied to the underlying instruction, which is then used to look up the metadata, which is why added it on construction of the recipes.

Ahh, right. Agree that data based on underlying IR should be recorded when building VPlan and its initial recipes. Perhaps its time to start relieving the (passing and) recording of Instructions in recipes.

We could add VPIRMetadata to VPInstruction, and add the metadata before creating widen recipes (at that point we must have the underlying instruction available, although it might warrant even more a separate class for such VPInstructions at the early stage, raises the question what it should be called and its relationship with VPIRInstruction).

Or change the lookup of the metadata to be based on SCEV, which would require building SCEVs from various VP recipes. Both may also be good to do as follow-ups

@@ -2364,7 +2364,7 @@ void InnerLoopVectorizer::scalarizeInstruction(const Instruction *Instr,
InputLane = VPLane::getFirstLane();
Cloned->setOperand(I.index(), State.get(Operand, InputLane));
}
State.addNewMetadata(Cloned, Instr);
RepRecipe->applyMetadata(*Cloned);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better placed next to applyFlags() above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

@@ -1198,10 +1200,14 @@ class VPIRMetadata {
protected:
VPIRMetadata() {}
VPIRMetadata(Instruction &I) { getMetadataToPropagate(&I, Metadata); }
VPIRMetadata(MDArrayRef Metadata) : Metadata(Metadata) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a copy constructor, i.e., given a (const) VPIRMetada& to clone / copy from, who should have access to the latter's Metadata w/o exposing getMetadata()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but then we would need to variants of constructors, one taking the metadata as list (or taking LVer), as well as one taking VPIRMetadata to clone?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, hopefully turning getMetadata() protected or private - or removing it?

(There could be one constructor taking an Instruction and an optional complementary no-alias Metadata, but then cloning a recipe may be difficult because all metadata is stored together in VPIRMetadata...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to always use the one taking VPIRMetadata, thanks

@@ -1198,10 +1200,14 @@ class VPIRMetadata {
protected:
VPIRMetadata() {}
VPIRMetadata(Instruction &I) { getMetadataToPropagate(&I, Metadata); }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should another VPIRMetadata(Instruction &I, LoopVersioning *LVer) constructor be added, for cases where I alone does not suffice to capture all its metadata? Effectively inlining VPRecipeBuilder::getMetadataToPropagate() into it. This includes augmenting the metadata propagated from I with newly added noalias metadata, so more "create"/compose/construct than "get".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above, may be better to avoid leaking more implementation details to recipe definitions?

@@ -2459,7 +2465,7 @@ class VPReductionEVLRecipe : public VPReductionRecipe {
/// copies of the original scalar type, one per lane, instead of producing a
/// single copy of widened type for all lanes. If the instruction is known to be
/// uniform only one copy, per lane zero, will be generated.
class VPReplicateRecipe : public VPRecipeWithIRFlags {
class VPReplicateRecipe : public VPRecipeWithIRFlags, public VPIRMetadata {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should replication of loads/stores have a separate recipe, if only they hold metadata? Would that separation be helpful in general.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to break up the recipes, to remove the reliance on the underlying IR instruction. Other case to split off would probably be calls, but probably best done separately, as this will require quite a bit of work (currently it is all based on cloning the existing IR instruction).

VPReplicateRecipes other than loads should also have metadata; currently they use all metadata from the original instruction (due to cloning); this may be incorrect, as only certain IR metadata should be preserved (like widen-recipes, which only preserve a subset of the original metadata), again probably best to fix separately.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message should mention that this patch also affects how ReplicateRecipe handle metadata. Does it fix the possibly incorrect behavior mentioned above? A test may be helpful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't fix it yet, it doesn't change the behavior as the exsting cloning of instructions will preserve all metadata, the noalias MD is now added by VPIRMetadata.

// Add metadata to the load, but set the result to the reverse shuffle, if
// needed.
State.addNewMetadata(cast<Instruction>(NewLI), &Ingredient);
// Add metadata to the load, but setVectorValue to the reverse shuffle.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Original "if needed" still holds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could retain it, but it seems a bit confusing as we are not checking here if it is needed; applyMetadata implicitly only adds the metadata 'when needed', if no metadata is needed, the recipe won't have metadata.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is apparently misleading ... IIUC it tries to explain why applyMetadata() operates on the original NewLI ("the load") while State.set() operates on the possibly reversed load - what may or may not be needed ("if needed") is the reverse shuffle?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dropped the comment, as the behavior should be clear from the code.

@@ -313,13 +313,8 @@ attributes #1 = { vscale_range(1,16) "target-features"="+sve" }
; DEFAULT: [[META8]] = !{[[META9:![0-9]+]]}
; DEFAULT: [[META9]] = distinct !{[[META9]], [[META7]]}
; DEFAULT: [[LOOP10]] = distinct !{[[LOOP10]], [[META1]], [[META2]]}
; DEFAULT: [[META11]] = !{[[META12:![0-9]+]]}
; DEFAULT: [[META12]] = distinct !{[[META12]], [[META13:![0-9]+]]}
; DEFAULT: [[META13]] = distinct !{[[META13]], !"LVerDomain"}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok to drop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think so, we now share the same metadata for both epilogue and main vector loops, which is fine, as the same memory runtime checks guard both loops, and guarantee that the accesses do not alias across the full iteration domain of the original loop.

Comment on lines 78 to 93
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(Inst, Metadata);
NewRecipe = new VPWidenLoadRecipe(
*Load, Ingredient.getOperand(0), nullptr /*Mask*/,
false /*Consecutive*/, false /*Reverse*/,
false /*Consecutive*/, false /*Reverse*/, Metadata,
Ingredient.getDebugLoc());
} else if (StoreInst *Store = dyn_cast<StoreInst>(Inst)) {
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(Inst, Metadata);
NewRecipe = new VPWidenStoreRecipe(
*Store, Ingredient.getOperand(1), Ingredient.getOperand(0),
nullptr /*Mask*/, false /*Consecutive*/, false /*Reverse*/,
Ingredient.getDebugLoc());
Metadata, Ingredient.getDebugLoc());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the existing constructors continue to be used here, for now, as they need to build their metadata based on Inst alone? Should widen load/store recipes be equipped with noalias metadata also when formed from VPInstructions, somehow (i.e., TODO). Perhaps VPInstructions with load or store opcode should be specialized, as in VPInstructionWithType.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would require another constructor variant, not taking metadata. Ideally we would add the metadata to the initial VPInstructions; that would mean larger updates to be consistent, i.e. all recipes that require metadata should take it from their original VPInstructions, so need to take a list of metadata. May be best done after this patch, once all recipes use the metadata from VPIRMetadata?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either another constructor variant, or falling back to use the metadata that can be propagated from Inst alone (as done now) when the metadata parameter passed is null or empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, but this is only used for the native path, not sure if it is worth adding a separate constructor?

Comment on lines 8493 to 8458
auto Metadata = getMetadataToPropagate(I);
if (LoadInst *Load = dyn_cast<LoadInst>(I))
return new VPWidenLoadRecipe(*Load, Ptr, Mask, Consecutive, Reverse,
I->getDebugLoc());
Metadata, I->getDebugLoc());

StoreInst *Store = cast<StoreInst>(I);
return new VPWidenStoreRecipe(*Store, Ptr, Operands[0], Mask, Consecutive,
Reverse, I->getDebugLoc());
Reverse, Metadata, I->getDebugLoc());
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to avoid passing both the underlying instruction and the metadata as parameters to these constructors, and considering the latter can no longer be derived from the former (in this case, another case below they can(?)), would it be a good time to retire passing the former? I.e., record the Alignment in VPWidenMemoryRecipe (along with Consecutive and Reversed), and the result type (for loads).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I was planning on removing the other information derived from Ingredient soon, inclduing alignment and result type, but probably best done separately.

Comment on lines 8998 to 9010
SmallVector<std::pair<unsigned, MDNode *>>
VPRecipeBuilder::getMetadataToPropagate(Instruction *I) const {
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(I, Metadata);
if (LVer && isa<LoadInst, StoreInst>(I)) {
const auto &[AliasScopeMD, NoAliasMD] = LVer->getNoAliasMetadataFor(I);
if (AliasScopeMD)
Metadata.emplace_back(LLVMContext::MD_alias_scope, AliasScopeMD);
if (NoAliasMD)
Metadata.emplace_back(LLVMContext::MD_noalias, NoAliasMD);
}
return Metadata;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be constructing a VPIRMetadata given I and LVer, rather than overloading getMetadataToPropagate(I).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could provide such a constructor, but then we would need to pass LVer to the recipe constructor, which seems like leaking more implementation details of LV into recipes. Passing the metadata that's needed as list seems more general, and more cleanly separates the interfaces. WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it's better to avoid passing LVer to recipe constructors. Perhaps also better to avoid passing them Metadata lists as well - and employ VPIRMetadata to pass/clone metadata between recipes instead? I.e., have getRecipeMetadata() return a VPIRMetadata which it constructs using an Instruction and LVer (and appropriate VPIRMetadata constructor). Then pass that to recipe constructors, who effectively clone it. Thereby hopefully avoiding a public getMetadata(). Sounds reasonable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks

@fhahn fhahn force-pushed the perf/vplan-noalias-md branch from 55f1a6b to 989dd75 Compare April 28, 2025 09:53
Comment on lines 8868 to 8879
auto *Recipe =
new VPReplicateRecipe(I, make_range(Operands.begin(), Operands.end()),
IsUniform, BlockInMask, getMetadataToPropagate(I));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/make_range(Operands.begin(), Operands.end())/Operands while we're here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need some further changes, done separately in #137798 (and the ArrayRef change)

Comment on lines -9766 to +9700
Builder);
Builder, nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use default argument = nullptr here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks

Comment on lines 8992 to 8999
VPRecipeBuilder::getMetadataToPropagate(Instruction *I) const {
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(I, Metadata);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice if both getMetaDataToPropagate functions have the same signature: take Metadata as second argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't have a strong preference, but returning the SmallVector directly makes for slightly more compact code at the use sites.

// Only use noalias metadata when using memory checks guaranteeing no overlap
// across all iterations.
const LoopAccessInfo *LAI = Legal->getLAI();
std::unique_ptr<LoopVersioning> LVer = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be moved to Legal to avoid the unique_ptr? Probably not worth it if it's too much work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just construct the LoopVersioning object unconditionally here , updated, thanks

auto *Recipe =
new VPReplicateRecipe(SI, R.operands(), true /* IsUniform */);
auto *Recipe = new VPReplicateRecipe(
SI, R.operands(), true /* IsUniform */, /*Mask*/ nullptr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: align the comment style, like:

Suggested change
SI, R.operands(), true /* IsUniform */, /*Mask*/ nullptr,
SI, R.operands(), true /* IsUniform */, nullptr /* Mask */,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yes this uses a different style, fixed thanks

@fhahn fhahn force-pushed the perf/vplan-noalias-md branch from 989dd75 to d4b85c7 Compare April 29, 2025 21:47
@@ -1198,10 +1200,14 @@ class VPIRMetadata {
protected:
VPIRMetadata() {}
VPIRMetadata(Instruction &I) { getMetadataToPropagate(&I, Metadata); }
VPIRMetadata(MDArrayRef Metadata) : Metadata(Metadata) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, hopefully turning getMetadata() protected or private - or removing it?

(There could be one constructor taking an Instruction and an optional complementary no-alias Metadata, but then cloning a recipe may be difficult because all metadata is stored together in VPIRMetadata...)

// Add metadata to the load, but set the result to the reverse shuffle, if
// needed.
State.addNewMetadata(cast<Instruction>(NewLI), &Ingredient);
// Add metadata to the load, but setVectorValue to the reverse shuffle.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is apparently misleading ... IIUC it tries to explain why applyMetadata() operates on the original NewLI ("the load") while State.set() operates on the possibly reversed load - what may or may not be needed ("if needed") is the reverse shuffle?

@@ -9013,6 +8997,20 @@ bool VPRecipeBuilder::getScaledReductions(
return false;
}

SmallVector<std::pair<unsigned, MDNode *>>
VPRecipeBuilder::getMetadataToPropagate(Instruction *I) const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wraps both getMetadataToPropagate() and getNoAliasMetadataFor(). Rather than overriding the former, how about calling it

Suggested change
VPRecipeBuilder::getMetadataToPropagate(Instruction *I) const {
VPRecipeBuilder::getRecipeMetadata(Instruction *I) const {

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed, thanks!

Comment on lines 8998 to 9010
SmallVector<std::pair<unsigned, MDNode *>>
VPRecipeBuilder::getMetadataToPropagate(Instruction *I) const {
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(I, Metadata);
if (LVer && isa<LoadInst, StoreInst>(I)) {
const auto &[AliasScopeMD, NoAliasMD] = LVer->getNoAliasMetadataFor(I);
if (AliasScopeMD)
Metadata.emplace_back(LLVMContext::MD_alias_scope, AliasScopeMD);
if (NoAliasMD)
Metadata.emplace_back(LLVMContext::MD_noalias, NoAliasMD);
}
return Metadata;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree that it's better to avoid passing LVer to recipe constructors. Perhaps also better to avoid passing them Metadata lists as well - and employ VPIRMetadata to pass/clone metadata between recipes instead? I.e., have getRecipeMetadata() return a VPIRMetadata which it constructs using an Instruction and LVer (and appropriate VPIRMetadata constructor). Then pass that to recipe constructors, who effectively clone it. Thereby hopefully avoiding a public getMetadata(). Sounds reasonable?

VPRecipeBuilder::getMetadataToPropagate(Instruction *I) const {
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(I, Metadata);
if (LVer && isa<LoadInst, StoreInst>(I)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
if (LVer && isa<LoadInst, StoreInst>(I)) {
if (!LVer || !isa<LoadInst, StoreInst>(I))
return Metadata;

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks

Comment on lines 78 to 93
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(Inst, Metadata);
NewRecipe = new VPWidenLoadRecipe(
*Load, Ingredient.getOperand(0), nullptr /*Mask*/,
false /*Consecutive*/, false /*Reverse*/,
false /*Consecutive*/, false /*Reverse*/, Metadata,
Ingredient.getDebugLoc());
} else if (StoreInst *Store = dyn_cast<StoreInst>(Inst)) {
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(Inst, Metadata);
NewRecipe = new VPWidenStoreRecipe(
*Store, Ingredient.getOperand(1), Ingredient.getOperand(0),
nullptr /*Mask*/, false /*Consecutive*/, false /*Reverse*/,
Ingredient.getDebugLoc());
Metadata, Ingredient.getDebugLoc());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either another constructor variant, or falling back to use the metadata that can be propagated from Inst alone (as done now) when the metadata parameter passed is null or empty?

@@ -2459,7 +2465,7 @@ class VPReductionEVLRecipe : public VPReductionRecipe {
/// copies of the original scalar type, one per lane, instead of producing a
/// single copy of widened type for all lanes. If the instruction is known to be
/// uniform only one copy, per lane zero, will be generated.
class VPReplicateRecipe : public VPRecipeWithIRFlags {
class VPReplicateRecipe : public VPRecipeWithIRFlags, public VPIRMetadata {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit message should mention that this patch also affects how ReplicateRecipe handle metadata. Does it fix the possibly incorrect behavior mentioned above? A test may be helpful.

@@ -2493,19 +2499,20 @@ class VPReplicateRecipe : public VPRecipeWithIRFlags {
public:
template <typename IterT>
VPReplicateRecipe(Instruction *I, iterator_range<IterT> Operands,
bool IsUniform, VPValue *Mask = nullptr)
bool IsUniform, VPValue *Mask = nullptr,
ArrayRef<std::pair<unsigned, MDNode *>> Metadata = {})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: if a default/empty Metadata is passed, should getMetadataToPropagate(I) be used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best done separately, as for now the existing metadata is ignored in VPIRMetadata, except new noalias.

@fhahn fhahn force-pushed the perf/vplan-noalias-md branch from d4b85c7 to 2a20107 Compare May 3, 2025 10:20
@@ -9010,6 +8994,20 @@ bool VPRecipeBuilder::getScaledReductions(
return false;
}

VPIRMetadata VPRecipeBuilder::getRecipeMetadata(Instruction *I) const {
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(I, Metadata);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
::getMetadataToPropagate(I, Metadata);
getMetadataToPropagate(I, Metadata);

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code now gone/moved, thanks

@@ -155,9 +159,10 @@ class VPRecipeBuilder {
const TargetTransformInfo *TTI,
LoopVectorizationLegality *Legal,
LoopVectorizationCostModel &CM,
PredicatedScalarEvolution &PSE, VPBuilder &Builder)
PredicatedScalarEvolution &PSE, VPBuilder &Builder,
LoopVersioning *LVer = nullptr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: have the caller supply null to emphasize its lack of LVer which may imply lack of metadata?

Suggested change
LoopVersioning *LVer = nullptr)
LoopVersioning *LVer)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this was due to earlier comments, added it back for now, thanks

SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(I, Metadata);
if (!LVer || !isa<LoadInst, StoreInst>(I))
return {};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return {};
return {Metadata};

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code now gone/moved, thanks

@@ -9010,6 +8994,20 @@ bool VPRecipeBuilder::getScaledReductions(
return false;
}

VPIRMetadata VPRecipeBuilder::getRecipeMetadata(Instruction *I) const {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this simply return VPIRMetadata(I, LVer);, letting VPIRMetadata take care of how to construct itself given both I and LVer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved, thanks!

: VPRecipeWithIRFlags(VPDef::VPReplicateSC, Operands, *I),
IsUniform(IsUniform), IsPredicated(Mask) {
VPIRMetadata(Metadata), IsUniform(IsUniform), IsPredicated(Mask) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be

Suggested change
VPIRMetadata(Metadata), IsUniform(IsUniform), IsPredicated(Mask) {
VPIRMetadata(I, Metadata), IsUniform(IsUniform), IsPredicated(Mask) {

to propagate I's metadata when Metadata if empty, as in the default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use VPIRMetadata(I, LVer), to be passed in

Comment on lines +81 to +82
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(Inst, Metadata);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(Inst, Metadata);
VPIRMetadata Metadata(Inst);

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still awaits commit? Defining Metadata here is now redundant.

Comment on lines 88 to 89
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(Inst, Metadata);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done thanks

Clone = new VPReplicateRecipe(I, SinkCandidate->operands(), true);
Clone = new VPReplicateRecipe(I, SinkCandidate->operands(), true,
/*Mask*/ nullptr,
*cast<VPReplicateRecipe>(SinkCandidate));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: check if (auto RepSinkCandidate = dyn_cast<...>) above instead of if (isa<...>) and this cast?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks!

@@ -2793,7 +2799,7 @@ void VPlanTransforms::narrowInterleaveGroups(VPlan &Plan, ElementCount VF,
auto *L = new VPWidenLoadRecipe(
*cast<LoadInst>(LoadGroup->getInterleaveGroup()->getInsertPos()),
LoadGroup->getAddr(), LoadGroup->getMask(), /*Consecutive=*/true,
/*Reverse=*/false, LoadGroup->getDebugLoc());
/*Reverse=*/false, {}, LoadGroup->getDebugLoc());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Propagate LoadInst's metadata?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done thanks!

@fhahn fhahn force-pushed the perf/vplan-noalias-md branch 2 times, most recently from e7a9297 to c0cb8e7 Compare May 4, 2025 20:16
fhahn added 2 commits May 7, 2025 22:07
Use VPIRMetadata added in llvm#135272
to also manage no-alias metadata added by versioning.

Note that this means we have to build the no-alias metadata up-front
once. If it is not used, it will be discarded automatically.

!fixup move applyMetadata;

!fixup address comments, thanks

!fixup address remaining comments, thanks

!fixup address latest comments, thanks
@fhahn fhahn force-pushed the perf/vplan-noalias-md branch from c0cb8e7 to 7935e7c Compare May 7, 2025 21:11
@@ -9742,7 +9738,7 @@ VPlanPtr LoopVectorizationPlanner::tryToBuildVPlan(VFRange &Range) {
// Collect mapping of IR header phis to header phi recipes, to be used in
// addScalarResumePhis.
VPRecipeBuilder RecipeBuilder(*Plan, OrigLoop, TLI, &TTI, Legal, CM, PSE,
Builder);
Builder, nullptr);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Builder, nullptr);
Builder, nullptr /* LVer */);

@@ -219,9 +219,9 @@ VPTransformState::VPTransformState(const TargetTransformInfo *TTI,
DominatorTree *DT, AssumptionCache *AC,
IRBuilderBase &Builder, VPlan *Plan,
Loop *CurrentParentLoop, Type *CanonicalIVTy)
: TTI(TTI), VF(VF), CFG(DT), LI(LI), AC(AC), Builder(Builder), Plan(Plan),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the removal of AC initialization related?

/// \p I and noalias metadata guaranteed by runtime checks using \p LVer.
VPIRMetadata(Instruction &I, LoopVersioning *LVer);

VPIRMetadata(const VPIRMetadata &Other) : Metadata(Other.Metadata) {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
VPIRMetadata(const VPIRMetadata &Other) : Metadata(Other.Metadata) {}
/// Copy constructor for cloning.
VPIRMetadata(const VPIRMetadata &Other) : Metadata(Other.Metadata) {}

@@ -2823,7 +2829,7 @@ void VPlanTransforms::narrowInterleaveGroups(VPlan &Plan, ElementCount VF,
auto *S = new VPWidenStoreRecipe(
*cast<StoreInst>(StoreGroup->getInterleaveGroup()->getInsertPos()),
StoreGroup->getAddr(), Res, nullptr, /*Consecutive=*/true,
/*Reverse=*/false, StoreGroup->getDebugLoc());
/*Reverse=*/false, {}, StoreGroup->getDebugLoc());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/*Reverse=*/false, {}, StoreGroup->getDebugLoc());
/*Reverse=*/false, VPIRMetadata(SI), StoreGroup->getDebugLoc());

similar to LI above?

@@ -219,9 +219,9 @@ VPTransformState::VPTransformState(const TargetTransformInfo *TTI,
DominatorTree *DT, AssumptionCache *AC,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DominatorTree *DT, AssumptionCache *AC,
DominatorTree *DT,

?

@@ -2588,7 +2601,6 @@ static void scalarizeInstruction(const Instruction *Instr,
InputLane = VPLane::getFirstLane();
Cloned->setOperand(I.index(), State.get(Operand, InputLane));
}
State.addNewMetadata(Cloned, Instr);

// Place the cloned scalar in the new loop.
State.Builder.Insert(Cloned);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 2600 below has State.AC->registerAssumption(II);, so State still needs AC?

Comment on lines +81 to +82
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
::getMetadataToPropagate(Inst, Metadata);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still awaits commit? Defining Metadata here is now redundant.

// TODO: Handle converting to uniform recipes as separate transform,
// then cloning should be sufficient here.
Instruction *I = SinkCandidate->getUnderlyingInstr();
Clone = new VPReplicateRecipe(I, SinkCandidate->operands(), true);
Clone = new VPReplicateRecipe(I, SinkCandidate->operands(), true,
/*Mask*/ nullptr, *SinkCandidateRepR);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/*Mask*/ nullptr, *SinkCandidateRepR);
nullptr /*Mask*/, *SinkCandidateRepR);

@@ -345,7 +349,7 @@ static VPRegionBlock *createReplicateRegion(VPReplicateRecipe *PredRecipe,
auto *RecipeWithoutMask = new VPReplicateRecipe(
PredRecipe->getUnderlyingInstr(),
make_range(PredRecipe->op_begin(), std::prev(PredRecipe->op_end())),
PredRecipe->isUniform());
PredRecipe->isUniform(), /*Mask*/ nullptr, *PredRecipe);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
PredRecipe->isUniform(), /*Mask*/ nullptr, *PredRecipe);
PredRecipe->isUniform(), nullptr /*Mask*/, *PredRecipe);

and similarly below?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants