Thanks to visit codestin.com
Credit goes to github.com

Skip to content

[WIP][CodeGen] Modifying MBB's liveins representation as into regUnits #129847

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

vg0204
Copy link
Contributor

@vg0204 vg0204 commented Mar 5, 2025

Currently, the machine basicblock does not fully utilizes the laneBitmask associated with physReg liveIns to check for the precise liveness. Conservatively, it acts fully correct now, only if all liveIns check for MBB is in form it defines it for itself.

So, now with the use of register units tracking for MBB's liveIns , its possible to track & check liveness for all sorts of physRegs.

@llvmbot
Copy link
Member

llvmbot commented Mar 5, 2025

@llvm/pr-subscribers-llvm-regalloc
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-tools-llvm-exegesis
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-backend-arm

Author: Vikash Gupta (vg0204)

Changes

Currently, the machine basicblock does not fully utilizes the laneBitmask associated with physReg liveIns to check for the precise liveness. Conservatively, it acts fully correct now, only if all liveIns check for MBB is in form it defines it for itself.

So, now with the use of register units tracking for MBB's liveIns , its possible to track & check liveness for all sorts of physRegs.

It is needed to handle #129848


Full diff: https://github.com/llvm/llvm-project/pull/129847.diff

3 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/MachineBasicBlock.h (+15)
  • (modified) llvm/lib/CodeGen/MachineBasicBlock.cpp (+29-5)
  • (modified) llvm/test/CodeGen/ARM/aes-erratum-fix.ll (+4-4)
diff --git a/llvm/include/llvm/CodeGen/MachineBasicBlock.h b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
index 2de96fa85b936..ab88177b63fed 100644
--- a/llvm/include/llvm/CodeGen/MachineBasicBlock.h
+++ b/llvm/include/llvm/CodeGen/MachineBasicBlock.h
@@ -13,6 +13,7 @@
 #ifndef LLVM_CODEGEN_MACHINEBASICBLOCK_H
 #define LLVM_CODEGEN_MACHINEBASICBLOCK_H
 
+#include "llvm/ADT/BitVector.h"
 #include "llvm/ADT/DenseMapInfo.h"
 #include "llvm/ADT/GraphTraits.h"
 #include "llvm/ADT/SparseBitVector.h"
@@ -158,6 +159,7 @@ class MachineBasicBlock
 
   MachineFunction *xParent;
   Instructions Insts;
+  const TargetRegisterInfo *TRI;
 
   /// Keep track of the predecessor / successor basic blocks.
   SmallVector<MachineBasicBlock *, 4> Predecessors;
@@ -177,6 +179,10 @@ class MachineBasicBlock
   using LiveInVector = std::vector<RegisterMaskPair>;
   LiveInVector LiveIns;
 
+  /// Keeps track of live register units for those physical registers which
+  /// are livein of the basicblock.
+  BitVector LiveInRegUnits;
+
   /// Alignment of the basic block. One if the basic block does not need to be
   /// aligned.
   Align Alignment;
@@ -458,11 +464,17 @@ class MachineBasicBlock
   void addLiveIn(MCRegister PhysReg,
                  LaneBitmask LaneMask = LaneBitmask::getAll()) {
     LiveIns.push_back(RegisterMaskPair(PhysReg, LaneMask));
+    addLiveInRegUnit(PhysReg, LaneMask);
   }
   void addLiveIn(const RegisterMaskPair &RegMaskPair) {
     LiveIns.push_back(RegMaskPair);
+    addLiveInRegUnit(RegMaskPair.PhysReg, RegMaskPair.LaneMask);
   }
 
+  // Sets the register units for Reg based on the LaneMask in the
+  // LiveInRegUnits.
+  void addLiveInRegUnit(MCRegister Reg, LaneBitmask LaneMask);
+
   /// Sorts and uniques the LiveIns vector. It can be significantly faster to do
   /// this than repeatedly calling isLiveIn before calling addLiveIn for every
   /// LiveIn insertion.
@@ -484,6 +496,9 @@ class MachineBasicBlock
   void removeLiveIn(MCRegister Reg,
                     LaneBitmask LaneMask = LaneBitmask::getAll());
 
+  /// Resets the register units from LiveInRegUnits for the specified regsiters.
+  void removeLiveInRegUnit(MCRegister Reg);
+
   /// Return true if the specified register is in the live in set.
   bool isLiveIn(MCRegister Reg,
                 LaneBitmask LaneMask = LaneBitmask::getAll()) const;
diff --git a/llvm/lib/CodeGen/MachineBasicBlock.cpp b/llvm/lib/CodeGen/MachineBasicBlock.cpp
index b3a71d1144726..0e5055d7bec2c 100644
--- a/llvm/lib/CodeGen/MachineBasicBlock.cpp
+++ b/llvm/lib/CodeGen/MachineBasicBlock.cpp
@@ -35,6 +35,7 @@
 #include "llvm/IR/ModuleSlotTracker.h"
 #include "llvm/MC/MCAsmInfo.h"
 #include "llvm/MC/MCContext.h"
+#include "llvm/MC/MCRegisterInfo.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Target/TargetMachine.h"
@@ -51,10 +52,12 @@ static cl::opt<bool> PrintSlotIndexes(
     cl::init(true), cl::Hidden);
 
 MachineBasicBlock::MachineBasicBlock(MachineFunction &MF, const BasicBlock *B)
-    : BB(B), Number(-1), xParent(&MF) {
+    : BB(B), Number(-1), xParent(&MF),
+      TRI(MF.getSubtarget().getRegisterInfo()) {
   Insts.Parent = this;
   if (B)
     IrrLoopHeaderWeight = B->getIrrLoopHeaderWeight();
+  LiveInRegUnits.resize(TRI->getNumRegUnits());
 }
 
 MachineBasicBlock::~MachineBasicBlock() = default;
@@ -597,6 +600,14 @@ void MachineBasicBlock::printAsOperand(raw_ostream &OS,
   printName(OS, 0);
 }
 
+void MachineBasicBlock::addLiveInRegUnit(MCRegister Reg, LaneBitmask LaneMask) {
+  for (MCRegUnitMaskIterator Unit(Reg, TRI); Unit.isValid(); ++Unit) {
+    LaneBitmask UnitMask = (*Unit).second;
+    if ((UnitMask & LaneMask).any())
+      LiveInRegUnits.set((*Unit).first);
+  }
+}
+
 void MachineBasicBlock::removeLiveIn(MCRegister Reg, LaneBitmask LaneMask) {
   LiveInVector::iterator I = find_if(
       LiveIns, [Reg](const RegisterMaskPair &LI) { return LI.PhysReg == Reg; });
@@ -604,21 +615,32 @@ void MachineBasicBlock::removeLiveIn(MCRegister Reg, LaneBitmask LaneMask) {
     return;
 
   I->LaneMask &= ~LaneMask;
-  if (I->LaneMask.none())
+  if (I->LaneMask.none()) {
     LiveIns.erase(I);
+    removeLiveInRegUnit(I->PhysReg);
+  }
 }
 
 MachineBasicBlock::livein_iterator
 MachineBasicBlock::removeLiveIn(MachineBasicBlock::livein_iterator I) {
   // Get non-const version of iterator.
   LiveInVector::iterator LI = LiveIns.begin() + (I - LiveIns.begin());
+  removeLiveInRegUnit(LI->PhysReg);
   return LiveIns.erase(LI);
 }
 
+void MachineBasicBlock::removeLiveInRegUnit(MCRegister Reg) {
+  for (MCRegUnit Unit : TRI->regunits(Reg))
+    LiveInRegUnits.reset(Unit);
+}
+
 bool MachineBasicBlock::isLiveIn(MCRegister Reg, LaneBitmask LaneMask) const {
-  livein_iterator I = find_if(
-      LiveIns, [Reg](const RegisterMaskPair &LI) { return LI.PhysReg == Reg; });
-  return I != livein_end() && (I->LaneMask & LaneMask).any();
+  for (MCRegUnitMaskIterator Unit(Reg, TRI); Unit.isValid(); ++Unit) {
+    LaneBitmask UnitMask = (*Unit).second;
+    if ((UnitMask & LaneMask).any() && LiveInRegUnits.test((*Unit).first))
+      return true;
+  }
+  return false;
 }
 
 void MachineBasicBlock::sortUniqueLiveIns() {
@@ -1751,12 +1773,14 @@ MachineBasicBlock::getEndClobberMask(const TargetRegisterInfo *TRI) const {
 
 void MachineBasicBlock::clearLiveIns() {
   LiveIns.clear();
+  LiveInRegUnits.reset();
 }
 
 void MachineBasicBlock::clearLiveIns(
     std::vector<RegisterMaskPair> &OldLiveIns) {
   assert(OldLiveIns.empty() && "Vector must be empty");
   std::swap(LiveIns, OldLiveIns);
+  LiveInRegUnits.reset();
 }
 
 MachineBasicBlock::livein_iterator MachineBasicBlock::livein_begin() const {
diff --git a/llvm/test/CodeGen/ARM/aes-erratum-fix.ll b/llvm/test/CodeGen/ARM/aes-erratum-fix.ll
index 82f5bfd02a56e..e1361d6efa780 100644
--- a/llvm/test/CodeGen/ARM/aes-erratum-fix.ll
+++ b/llvm/test/CodeGen/ARM/aes-erratum-fix.ll
@@ -68,8 +68,8 @@ define arm_aapcs_vfpcc void @aese_via_call2(half %0, ptr %1) nounwind {
 ; CHECK-FIX-NEXT:    push {r4, lr}
 ; CHECK-FIX-NEXT:    mov r4, r0
 ; CHECK-FIX-NEXT:    bl get_inputf16
-; CHECK-FIX-NEXT:    vorr q0, q0, q0
 ; CHECK-FIX-NEXT:    vld1.64 {d16, d17}, [r4]
+; CHECK-FIX-NEXT:    vorr q0, q0, q0
 ; CHECK-FIX-NEXT:    aese.8 q8, q0
 ; CHECK-FIX-NEXT:    aesmc.8 q8, q8
 ; CHECK-FIX-NEXT:    vst1.64 {d16, d17}, [r4]
@@ -89,8 +89,8 @@ define arm_aapcs_vfpcc void @aese_via_call3(float %0, ptr %1) nounwind {
 ; CHECK-FIX-NEXT:    push {r4, lr}
 ; CHECK-FIX-NEXT:    mov r4, r0
 ; CHECK-FIX-NEXT:    bl get_inputf32
-; CHECK-FIX-NEXT:    vorr q0, q0, q0
 ; CHECK-FIX-NEXT:    vld1.64 {d16, d17}, [r4]
+; CHECK-FIX-NEXT:    vorr q0, q0, q0
 ; CHECK-FIX-NEXT:    aese.8 q8, q0
 ; CHECK-FIX-NEXT:    aesmc.8 q8, q8
 ; CHECK-FIX-NEXT:    vst1.64 {d16, d17}, [r4]
@@ -2222,8 +2222,8 @@ define arm_aapcs_vfpcc void @aesd_via_call2(half %0, ptr %1) nounwind {
 ; CHECK-FIX-NEXT:    push {r4, lr}
 ; CHECK-FIX-NEXT:    mov r4, r0
 ; CHECK-FIX-NEXT:    bl get_inputf16
-; CHECK-FIX-NEXT:    vorr q0, q0, q0
 ; CHECK-FIX-NEXT:    vld1.64 {d16, d17}, [r4]
+; CHECK-FIX-NEXT:    vorr q0, q0, q0
 ; CHECK-FIX-NEXT:    aesd.8 q8, q0
 ; CHECK-FIX-NEXT:    aesimc.8 q8, q8
 ; CHECK-FIX-NEXT:    vst1.64 {d16, d17}, [r4]
@@ -2243,8 +2243,8 @@ define arm_aapcs_vfpcc void @aesd_via_call3(float %0, ptr %1) nounwind {
 ; CHECK-FIX-NEXT:    push {r4, lr}
 ; CHECK-FIX-NEXT:    mov r4, r0
 ; CHECK-FIX-NEXT:    bl get_inputf32
-; CHECK-FIX-NEXT:    vorr q0, q0, q0
 ; CHECK-FIX-NEXT:    vld1.64 {d16, d17}, [r4]
+; CHECK-FIX-NEXT:    vorr q0, q0, q0
 ; CHECK-FIX-NEXT:    aesd.8 q8, q0
 ; CHECK-FIX-NEXT:    aesimc.8 q8, q8
 ; CHECK-FIX-NEXT:    vst1.64 {d16, d17}, [r4]

Copy link
Member

@lenary lenary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arm test changes look correct. I haven't checked the implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes to this file look correct - the vorr qN, qN, qN is necessary and has to be done to the input of the aes instruction pair, which it is still.

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing most of the changes necessary to do this. We need a full migration away from the existing storage as reg + lane mask, to the register units. This requires MIR print and parsing changes, and I'm sure there will be downstream fallout in use sites

@vg0204
Copy link
Contributor Author

vg0204 commented Mar 7, 2025

This is missing most of the changes necessary to do this. We need a full migration away from the existing storage as reg + lane mask, to the register units. This requires MIR print and parsing changes, and I'm sure there will be downstream fallout in use sites

So, should I start this as separate task entirely? Also, don't we need MCphysReg mandatorily as part of liveIn representation too, as that is used entirely all over the llvm codegen, as regunits alone cannot hold that sense? Once we go into realm of regUnits, it would not be possible to know their exact source MCPhysReg, if not tracked parallely.

Doesn't something like reg + liveRegUnit is better, as its still technically as same as reg + lane. Also, now rather than maintininng list of liveIn regs, we can maintain a set of all liveIn regUnits (basically union of liveRegUnit for all regs in MBB)

@vg0204
Copy link
Contributor Author

vg0204 commented Mar 10, 2025

Also, the liveIn registers represent physReg which need not be entirely live into MBB (so the laneMask makes sense), but migrating entirely into regUnits (representing only live parts of liveIn physRegs) could cause severe fallout as you mentioned at use sites. So, we strongly cannot eliminate the register concept as liveIn of MBB I fee!

Could you through some light on it @arsenm!

@vg0204
Copy link
Contributor Author

vg0204 commented Mar 17, 2025

KInd Ping @arsenm !

@vg0204
Copy link
Contributor Author

vg0204 commented Mar 25, 2025

@arsenm can you comments on my thought (the latest comments) regarding the usage of both LiveRegUnits & LiveInList hand-in-hand, so I can proceed with it accordingly!

cc : @jayfoad , @MatzeB

@arsenm
Copy link
Contributor

arsenm commented Mar 25, 2025

This patch is about 5% of the changes required. This requires fully removing the existing block tracking in terms of physical registers. As a first step, I think you should address introduction of LiveRegUnitMasks. We need a new MachineOperand type, a MIR section to print a compacted regunit mask with an ID number for referencing in the instruction stream (similar to how frame indexes work).

@vg0204
Copy link
Contributor Author

vg0204 commented Apr 1, 2025

This patch is about 5% of the changes required. This requires fully removing the existing block tracking in terms of physical registers. As a first step, I think you should address introduction of LiveRegUnitMasks. We need a new MachineOperand type, a MIR section to print a compacted regunit mask with an ID number for referencing in the instruction stream (similar to how frame indexes work).

Just a brief of things I need to do as per my study (please let me know if anything is misunderstood) :

  1. Need to have a list of LaneBitmasks in MF (or in MFI, where it makes more sense ??) used & its corresponding YAML serialization object inr MIRYamlMapping. Needed both MIRParser & MIRPrinter logics for it respectively.
  2. Defining lexing and parsing for LaneBitmask identifier token object respectively in MILexer & MIParser.
  3. Finally need a new index MO to track & reference the LaneBitmask objects in the MIs.

Did I miss anything @arsenm ?

@arsenm
Copy link
Contributor

arsenm commented Apr 1, 2025

I guess those points are related work we need to move all infrastructure to work on register units. If you're just talking about changing the block live in representation, we can probably get away without it.

We do need MIR print/parse changes for the liveins in terms of register units. It may or may not make sense to share code with regunit masks for that.

@vg0204
Copy link
Contributor Author

vg0204 commented Apr 1, 2025

We need a new MachineOperand type, a MIR section to print a compacted regunit mask with an ID number for referencing in the instruction stream (similar to how frame indexes work).

If we want to achieve something like this, I am unable to see how can we get away from all steps mentioned, as I deduced those after back-tracking the usage of ID number referencing for FI in the instructions.

Secondly, liveins as the regunit mask in someway (alongwith physReg) & representing them completely as register units apparently seems same but are 2 things as in the latter, the concept of liveins as physReg will vanish irreversibly causing significant downfall at use-site of liveins()

@vg0204
Copy link
Contributor Author

vg0204 commented Apr 1, 2025

We do need MIR print/parse changes for the liveins in terms of register units. It may or may not make sense to share code with regunit masks for that.

This seems not difficult to do once the clarity on livein representation has been reached!

@vg0204
Copy link
Contributor Author

vg0204 commented Apr 3, 2025

Gentle ping @arsenm !

@vg0204
Copy link
Contributor Author

vg0204 commented Apr 8, 2025

@arsenm , any final thoughts & suggestions on my doubts ?

@vg0204
Copy link
Contributor Author

vg0204 commented Apr 14, 2025

We do need MIR print/parse changes for the liveins in terms of register units. It may or may not make sense to share code with regunit masks for that.

@arsenm don't you think it's as a big change (from its impact perspective, even not if implementation) that it can break everything, as unlike just introduction of optional laneBitmask to physeg, as supported now, is an extension which supported backward compatability, but this will need compulsory changes in every test files for all archs, as we are supposedly completely stripping physReg concept, replacing with regunits as block liveins , so what about this?

Secondly, liveins as the regunit mask in someway (alongwith physReg) & representing them completely as register units apparently seems same but are 2 things as in the latter, the concept of liveins as physReg will vanish irreversibly causing significant downfall at use-site of liveins()

@vg0204
Copy link
Contributor Author

vg0204 commented Apr 28, 2025

@arsenm don't you think it's as a big change (from its impact perspective, even not if implementation) that it can break everything, as unlike just introduction of optional laneBitmask to physeg, as supported now, is an extension which supported backward compatability, but this will need compulsory changes in every test files for all archs, as we are supposedly completely stripping physReg concept, replacing with regunits as block liveins , so what about this?

Secondly, liveins as the regunit mask in someway (alongwith physReg) & representing them completely as register units apparently seems same but are 2 things as in the latter, the concept of liveins as physReg will vanish irreversibly causing significant downfall at use-site of liveins()

Hey @arsenm, if you could spare sometime and comment on this, it would be great!

@vg0204
Copy link
Contributor Author

vg0204 commented Apr 29, 2025

Kind Ping @jayfoad @MatzeB !

@vg0204
Copy link
Contributor Author

vg0204 commented Apr 29, 2025

Could one solution be : We can store only the live regunits (for either of its root) as MCPhysReg in LiveInList using the its attached laneBitMask value provided (or entire Reg in case laneBitmask is set as All), rather than storing the laneBitMask as part of the LiveIn. For Example :

liveins: $vgpr0_vgpr1:0x00003, $vgpr4, $vgpr5:0x000001

will be transformed into :

liveins: $vgpr0_lo16, $vgpr0_hi16, $vgpr4, $vgpr5_hi16

Does it make sense, along the line of my doubts just mentioned above?

@arsenm
Copy link
Contributor

arsenm commented Apr 29, 2025

@arsenm don't you think it's as a big change (from its impact perspective, even not if implementation) that it can break everything, as unlike just introduction of optional laneBitmask to physeg, as supported now, is an extension which supported backward compatability, but this will need compulsory changes in every test files for all archs, as we are supposedly completely stripping physReg concept, replacing with regunits as block liveins , so what about this?

There is no compatibility concern, and every target will need to be updated. Just introducing a parallel structure with the same purpose is actively harmful. This is a full migration effort, I don't see how it makes sense for both livein representations to coexist. It will only provide a new source of inconsistency bugs and waste compile time maintaining both cases.

Secondly, liveins as the regunit mask in someway (alongwith physReg) & representing them completely as register units apparently seems same but are 2 things as in the latter, the concept of liveins as physReg will vanish irreversibly causing significant downfall at use-site of liveins()

What do you mean by "significant downfall"? Conceptually I don't see why you should need to enumerate the live in registers frequently. Most uses care about whether the register is live or not

will be transformed into :
liveins: $vgpr0_lo16, $vgpr0_hi16, $vgpr4, $vgpr5_hi16
Does it make sense, along the line of my doubts just mentioned above?

This is the whole point, re-encode the live-ins in terms of register units

@vg0204
Copy link
Contributor Author

vg0204 commented May 7, 2025

@jayfoad , should the issue #138552 need to be resolved completely before this works can be taken forward or can we check for regunit with laneBitmask as 0x0 to understand its a fake (or shared regunit) & setting all its bit as suggested by you in #78942. For each regUnit, in registers with provided LaneMask

(UnitMask & LaneMask).any()) --> (UnitMask.none() || (UnitMask & LaneMask).any())

@vg0204
Copy link
Contributor Author

vg0204 commented May 7, 2025

Also, around 2.2k LIT test need to be updated spanned across multiple archs, so what would be the right way to handle it in terms of PRs, as never dealt with such magnitude of changes before. Also, if anyway to automate it (for all kinds of lit tests), it would be appreciated!

@vg0204 vg0204 requested a review from qcolombet May 7, 2025 06:59
@jayfoad
Copy link
Contributor

jayfoad commented May 7, 2025

should the issue #138552 need to be resolved completely before this works can be taken forward

Yes, I think we need to give leaf registers their own unique regunit, even in the presence of ad hoc aliases.

In the VE case, SX0 has two subregs SW0 and SF0 which alias each other. Currently they all have the same single regunit (number 26) so a regnit-based representation can't even distinguish a superreg from its subregs.

cc @qcolombet

@qcolombet
Copy link
Collaborator

should the issue #138552 need to be resolved completely before this works can be taken forward

Yes, I think we need to give leaf registers their own unique regunit, even in the presence of ad hoc aliases.

In the VE case, SX0 has two subregs SW0 and SF0 which alias each other. Currently they all have the same single regunit (number 26) so a regnit-based representation can't even distinguish a superreg from its subregs.

cc @qcolombet

I haven't read carefully the whole thread, but I agree with this.

@qcolombet
Copy link
Collaborator

Also, side question, what is the point of changing the liveins representation?
What does it solve/allow?

@vg0204
Copy link
Contributor Author

vg0204 commented May 7, 2025

Also, side question, what is the point of changing the liveins representation?
What does it solve/allow?

I changed it as tracking of live-ins as regUnits (which are constant in number) rather than registers are quite less expensive. As the number of registers in arch like ourselves can go up as 9k due to aliasing & RegisterTuple stuff, which can be avoided due to exact nature of it

@vg0204
Copy link
Contributor Author

vg0204 commented May 7, 2025

Yes, I think we need to give leaf registers their own unique regunit, even in the presence of ad hoc aliases.

As, I looked I think we need to make changes here to handle it :

// Initialize RegUnitList. Because getSubRegs is called recursively, this

So, anyone taking care of it as of now, because its needed to be done for my work to move forward?

@jayfoad
Copy link
Contributor

jayfoad commented May 7, 2025

So, anyone taking care of it as of now

No-one is working on it, as far as I know.

@vg0204
Copy link
Contributor Author

vg0204 commented May 7, 2025

So, anyone taking care of it as of now

No-one is working on it, as far as I know.

Its quiet easy to achieve from implemetation point of view (cannot say the same if its have conceptually implication at big). So, I am just in the process of trying it locally.

@arsenm
Copy link
Contributor

arsenm commented May 7, 2025

Also, side question, what is the point of changing the liveins representation? What does it solve/allow?

It's confusing and unclear how to interpret mixed aliasing register entries with the lanemasks. There's also a compile time cost every time we need to translate between the physical register representation and the actual liveness tracking in regunits. Since these are purely used for liveness, they should always be in terms of regunits

@vg0204
Copy link
Contributor Author

vg0204 commented May 7, 2025

So, anyone taking care of it as of now

No-one is working on it, as far as I know.

Its quiet easy to achieve from implemetation point of view (cannot say the same if its have conceptually implication at big). So, I am just in the process of trying it locally.

It worked as expected, should I proceed with it as a PR for review @jayfoad !

@jayfoad
Copy link
Contributor

jayfoad commented May 8, 2025

So, anyone taking care of it as of now

No-one is working on it, as far as I know.

Its quiet easy to achieve from implemetation point of view (cannot say the same if its have conceptually implication at big). So, I am just in the process of trying it locally.

It worked as expected, should I proceed with it as a PR for review @jayfoad !

Sure.

vg0204 added 3 commits May 15, 2025 05:14
Currently, the machine basicblock does not fully utilizes the
laneBitmask associated with physReg liveIns to check for the
precise liveness. Conservatively, it acts fully correct now, only
if all liveIns check for MBB is in form it defines it for itself.

So, now with the use of register units tracking for MBB's liveIns
, its possible to track & check liveness for all sorts of physRegs.
@vg0204 vg0204 force-pushed the vg0204/add-liveRegUnit-compute-in-MBB branch from 4b91b83 to a1b695d Compare May 15, 2025 06:43
Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- llvm/include/llvm/CodeGen/LivePhysRegs.h llvm/include/llvm/CodeGen/MachineBasicBlock.h llvm/lib/CodeGen/AggressiveAntiDepBreaker.cpp llvm/lib/CodeGen/BranchFolding.cpp llvm/lib/CodeGen/BranchRelaxation.cpp llvm/lib/CodeGen/CriticalAntiDepBreaker.cpp llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp llvm/lib/CodeGen/LiveIntervals.cpp llvm/lib/CodeGen/LivePhysRegs.cpp llvm/lib/CodeGen/LiveRegUnits.cpp llvm/lib/CodeGen/LiveVariables.cpp llvm/lib/CodeGen/MIRPrinter.cpp llvm/lib/CodeGen/MachineBasicBlock.cpp llvm/lib/CodeGen/MachineCopyPropagation.cpp llvm/lib/CodeGen/MachineLICM.cpp llvm/lib/CodeGen/MachineSink.cpp llvm/lib/CodeGen/MachineVerifier.cpp llvm/lib/CodeGen/PrologEpilogInserter.cpp llvm/lib/CodeGen/RDFGraph.cpp llvm/lib/CodeGen/RDFLiveness.cpp llvm/lib/CodeGen/ReachingDefAnalysis.cpp llvm/lib/CodeGen/RegAllocFast.cpp llvm/lib/CodeGen/ScheduleDAGInstrs.cpp llvm/lib/CodeGen/ShrinkWrap.cpp llvm/lib/CodeGen/VirtRegMap.cpp llvm/lib/Target/AArch64/AArch64CollectLOH.cpp llvm/lib/Target/AArch64/AArch64FrameLowering.cpp llvm/lib/Target/AMDGPU/SIFrameLowering.cpp llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp llvm/lib/Target/ARM/ARMConstantIslandPass.cpp llvm/lib/Target/ARM/ARMFrameLowering.cpp llvm/lib/Target/ARM/ARMLowOverheadLoops.cpp llvm/lib/Target/ARM/ThumbRegisterInfo.cpp llvm/lib/Target/AVR/AVRFrameLowering.cpp llvm/lib/Target/Hexagon/HexagonBlockRanges.cpp llvm/lib/Target/Hexagon/HexagonCFGOptimizer.cpp llvm/lib/Target/M68k/M68kFrameLowering.cpp llvm/lib/Target/Mips/MipsDelaySlotFiller.cpp llvm/lib/Target/X86/X86FloatingPoint.cpp llvm/lib/Target/X86/X86FrameLowering.cpp llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp llvm/unittests/tools/llvm-exegesis/X86/SnippetRepetitorTest.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/CodeGen/MachineBasicBlock.cpp b/llvm/lib/CodeGen/MachineBasicBlock.cpp
index 546733d3e..57b9d3960 100644
--- a/llvm/lib/CodeGen/MachineBasicBlock.cpp
+++ b/llvm/lib/CodeGen/MachineBasicBlock.cpp
@@ -636,7 +636,7 @@ MachineBasicBlock::removeLiveIn(MachineBasicBlock::livein_iterator I) {
 }
 
 bool MachineBasicBlock::isLiveIn(MCRegister Reg, LaneBitmask LaneMask) const {
-  if(!Reg.isPhysical())
+  if (!Reg.isPhysical())
     return false;
 
   const TargetRegisterInfo *TRI = getParent()->getSubtarget().getRegisterInfo();
diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index 206b7cf5d..5f6c45247 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -160,7 +160,7 @@ struct MachineVerifier {
     if (Reg.isPhysical()) {
       for (MCRegUnit Unit : TRI->regunits(Reg.asMCReg()))
         for (MCRegUnitRootIterator RootReg(Unit, TRI); RootReg.isValid();
-            ++RootReg)
+             ++RootReg)
           RV.push_back(*RootReg);
     }
   }

@vg0204
Copy link
Contributor Author

vg0204 commented May 16, 2025

@jayfoad , @arsenm , can you review this, as now only about 2k LIT tests (all across different targets) needs to be updated (as all having ExitCode : 1). How should do we think they should be handled?

Copy link
Contributor

@jayfoad jayfoad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I quite like the idea of tracking regunit roots instead of regunits themselves. At least it makes this a smaller incremental change.

@@ -13,7 +13,9 @@
#ifndef LLVM_CODEGEN_MACHINEBASICBLOCK_H
#define LLVM_CODEGEN_MACHINEBASICBLOCK_H

#include "llvm/ADT/BitVector.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need this

@@ -174,8 +176,7 @@ class MachineBasicBlock
std::optional<uint64_t> IrrLoopHeaderWeight;

/// Keep track of the physical registers that are livein of the basicblock.
using LiveInVector = std::vector<RegisterMaskPair>;
LiveInVector LiveIns;
DenseSet<MCRegister> LiveIns;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it's a good idea to change from std::vector to DenseSet. The patch would be less invasive if you just changed it to std:vector<MCRegister>

You should at least benchmark it using http://llvm-compile-time-tracker.com/. You can do this by asking @nikic to add your fork of LLVM, and then any branches you create called "perf/*" will automatically get benchmarked.

@@ -88,7 +88,7 @@ void LiveRegUnits::accumulate(const MachineInstr &MI) {
static void addBlockLiveIns(LiveRegUnits &LiveUnits,
const MachineBasicBlock &MBB) {
for (const auto &LI : MBB.liveins())
LiveUnits.addRegMasked(LI.PhysReg, LI.LaneMask);
LiveUnits.addRegMasked(LI, LaneBitmask::getAll());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further simplification is possible here, since this is the only call to addRegMasked so it no longer needs a Mask argument.

OS << LS << printReg(LI.PhysReg, &TRI);
if (!LI.LaneMask.all())
OS << ":0x" << PrintLaneMask(LI.LaneMask);
bool First = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should continue to use LS instead of First. You only need to remove the part that prints the lanemask.

Comment on lines +71 to +72
Classes[Reg] = reinterpret_cast<TargetRegisterClass *>(-1);
KillIndices[Reg] = BBSize;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't change the Reg.id() part. It just makes the patch bigger and harder to review.

OS << LS << printReg(LI.PhysReg, TRI);
if (!LI.LaneMask.all())
OS << ":0x" << PrintLaneMask(LI.LaneMask);
for (const MCRegister Reg : liveins()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the const. (Pointers to const and const references are good, but plain const is not.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another reason not to use DenseMap: the iteration order here will be undefined, so we will get spurious differences in MIR output printing.

LaneMask |= J->LaneMask;
Out->PhysReg = PhysReg;
Out->LaneMask = LaneMask;
if(!Reg.isPhysical())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be an assert

Comment on lines +627 to +635
if (I == LiveIns.end())
return I;

DenseSet<MCRegister>::iterator start = LiveIns.begin();
while (start != I)
start++;
MachineBasicBlock::livein_iterator next = start;
LiveIns.erase(start);
return next++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty horrible and I'm not sure it works.

Either leave liveins as a std::vector, in which case this code does not need to change.

Or remove this method altogether. It is only used in one place, in X86FloatingPoint, and that code can ber rewritten to do this in some other way.

}

Register
MachineBasicBlock::addLiveIn(MCRegister PhysReg, const TargetRegisterClass *RC) {
assert(getParent() && "MBB must be inserted in function");
assert(PhysReg.isPhysical() && "Expected physreg");
assert(PhysReg && "Expected physreg");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you change this?

@jayfoad
Copy link
Contributor

jayfoad commented May 16, 2025

can you review this, as now only about 2k LIT tests (all across different targets) needs to be updated

2000 is still a lot! We need to understand which ones only fail because the way liveins are printed in MIR has changed (these ones can simply be updated) and which ones actually have different codegen, or fail in some other way.

@jayfoad
Copy link
Contributor

jayfoad commented May 16, 2025

There are genuine codegen differences in:

test/CodeGen/AArch64/zero-call-used-regs.ll
test/CodeGen/X86/zero-call-used-regs.ll

Looks like PEIImpl::insertZeroCallUsedRegs will need updating to understand that liveins() may only contain subregs of the registers that it is querying.

@jayfoad
Copy link
Contributor

jayfoad commented May 16, 2025

Rebasing on #140248 should fix one of 2000 failures.

@jayfoad
Copy link
Contributor

jayfoad commented May 16, 2025

There are codegen diffs in test/CodeGen/Thumb2/LowOverheadLoops/. It looks like this part of LowOverheadLoop::ValidateLiveOuts does not cope well with subregs in liveins():

for (const MachineBasicBlock::RegisterMaskPair &RegMask : ExitBB->liveins()) {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants