[SystemZ] Add proper mcount handling #135767

Merged: 3 commits, May 2, 2025

Contributor

@dominik-steenken dominik-steenken commented Apr 15, 2025

When compiling with -pg, the EntryExitInstrumenterPass will insert calls to the glibc function mcount at the beginning of each MachineFunction.

On SystemZ, these calls require special handling:

  • The call to mcount needs to happen at the beginning of the prologue.
  • Prior to the call to mcount, register %r14, the return address of the callee function, must be stored 8 bytes above the stack pointer %r15. After the call to mcount returns, that register needs to be restored.

This commit adds some special handling to the EntryExitInstrumenterPass that keeps the insertion of the mcount function into the module, but skips over insertion of the actual call in order to perform this insertion in the emitPrologue function. There, a simple sequence of store/call/load is inserted, which implements the above.
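
To illustrate why the store and reload are needed, here is a toy model of the sequence in plain C++. It is only a sketch: `Machine`, `callMcount`, and `instrumentedEntry` are hypothetical names invented for this illustration, the addresses are arbitrary, and the model assumes nothing beyond the call instruction overwriting %r14 with its own return address.

```cpp
#include <cstdint>
#include <map>

// Toy machine state: two registers and a sparse memory. Purely illustrative;
// it does not model real SystemZ semantics beyond what the text describes.
struct Machine {
  std::uint64_t r14 = 0;  // return address register
  std::uint64_t r15 = 0;  // stack pointer
  std::map<std::uint64_t, std::uint64_t> memory;
};

// Stand-in for `brasl %r14, mcount@PLT`: the call overwrites %r14 with the
// address of the instruction following the call.
void callMcount(Machine &m, std::uint64_t addressAfterCall) {
  m.r14 = addressAfterCall;
}

// The store/call/load sequence from the description.
void instrumentedEntry(Machine &m, std::uint64_t addressAfterCall) {
  m.memory[m.r15 + 8] = m.r14;      // stg %r14, 8(%r15)
  callMcount(m, addressAfterCall);  // brasl %r14, mcount@PLT
  m.r14 = m.memory[m.r15 + 8];      // lg %r14, 8(%r15)
}
```

Starting from `r14 = 0x1000` and `r15 = 0x8000`, calling `instrumentedEntry(m, 0x2000)` leaves `r14` at `0x1000` again, with the saved copy at address `0x8008`. Without the reload, `r14` would still hold `0x2000` and the instrumented function would return to the wrong place.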

The desired change in the EntryExitInstrumenterPass necessitated the addition of a new attribute and attribute kind to each function, which is used to trigger the postprocessing, i.e. the call insertion, in emitPrologue. Note that the new attribute must be of a different kind than the mcount attribute, since otherwise it would replace that attribute and later be deleted by the code that was intended to delete mcount. The new attribute is called insert-mcount, while the attribute kind is systemz-backend, to clearly mark it as a SystemZ-specific backend concern.
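
The point about attribute kinds can be sketched with a toy key/value map. This is not LLVM's attribute API: `AttrMap`, `setAttr`, `removeAttr`, and the two helper functions are hypothetical stand-ins, and the only assumption is that a function holds at most one string attribute per kind. The kind and value strings are the ones mentioned in this PR.

```cpp
#include <map>
#include <string>

// Toy stand-in for a function's string attributes, keyed by attribute kind.
// It only mirrors the "one value per kind" behaviour described in the text.
using AttrMap = std::map<std::string, std::string>;

// Setting an attribute replaces any existing value of the same kind.
void setAttr(AttrMap &attrs, const std::string &kind,
             const std::string &value) {
  attrs[kind] = value;
}

// Stand-in for the cleanup step, which removes an attribute by kind.
void removeAttr(AttrMap &attrs, const std::string &kind) { attrs.erase(kind); }

// Marker stored under the SAME kind: it overwrites "mcount" and is then
// deleted by the cleanup that was meant for "mcount".
bool markerSurvivesSameKind() {
  AttrMap attrs{{"instrument-function-entry-inlined", "mcount"}};
  setAttr(attrs, "instrument-function-entry-inlined", "insert-mcount");
  removeAttr(attrs, "instrument-function-entry-inlined");
  return !attrs.empty();
}

// Marker stored under a separate kind: the cleanup leaves it in place.
bool markerSurvivesSeparateKind() {
  AttrMap attrs{{"instrument-function-entry-inlined", "mcount"}};
  setAttr(attrs, "systemz-backend", "insert-mcount");
  removeAttr(attrs, "instrument-function-entry-inlined");
  return attrs.count("systemz-backend") == 1;
}
```

In this model `markerSurvivesSameKind()` returns `false` and `markerSurvivesSeparateKind()` returns `true`, which is the reason given above for introducing a distinct kind.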

This PR should address issue #121137 . The test inserted here is derived from the example given in that issue.

@llvmbot
Member

llvmbot commented Apr 15, 2025

@llvm/pr-subscribers-backend-systemz

Author: Dominik Steenken (dominik-steenken)

Changes

When compiling with -pg, the EntryExitInstrumenterPass will insert calls to the glibc function mcount at the beginning of each MachineFunction.

On SystemZ, these calls require special handling:

  • The call to mcount needs to happen at the beginning of the prologue.
  • Prior to the call to mcount, register %r14, the return address of the callee function, must be stored 8 bytes above the stack pointer %r15. After the call to mcount returns, that register needs to be restored.

This commit adds some special handling to the EntryExitInstrumenterPass that keeps the insertion of the mcount function into the module, but skips over insertion of the actual call in order to perform this insertion in the emitPrologue function. There, a simple sequence of store/call/load is inserted, which implements the above.

Note that the desired change in the EntryExitInstrumenterPass necessitated a change to the signature of the insertCall function: it is now possible that no call is inserted, and that needs to be communicated to the caller. We can't ignore that fact either, because we need Changed to be false in the caller for SystemZ, so that we can use it as a guard to prevent the deletion of the mcount attribute from the function, which we still need for our postprocessing, i.e. the call insertion, in emitPrologue.
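
The guard described in this paragraph can be sketched in isolation. The following is a toy model, not the LLVM code from the diff: `ToyFunction`, `toyInsertCall`, and `toyRunOnFunction` are hypothetical names, and the target check is reduced to a boolean parameter.

```cpp
#include <set>
#include <string>

// Toy stand-in for a function carrying string attributes (hypothetical type).
struct ToyFunction {
  std::set<std::string> attrs;
  int insertedCalls = 0;
};

// Returns true only when a call was actually inserted. On the toy "SystemZ"
// path the call is deferred to a later stage, so nothing is inserted here.
bool toyInsertCall(ToyFunction &f, bool targetIsSystemZ) {
  if (targetIsSystemZ)
    return false;
  ++f.insertedCalls;
  return true;
}

// Caller: drop the entry attribute only if the call really was inserted,
// so that a later stage can still see the attribute on the deferred path.
bool toyRunOnFunction(ToyFunction &f, bool targetIsSystemZ) {
  bool changed = toyInsertCall(f, targetIsSystemZ);
  if (changed)
    f.attrs.erase("instrument-function-entry-inlined");
  return changed;
}
```

On the deferred path the attribute stays on the function and no call is counted; on the regular path the call is counted and the attribute is removed.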

This PR should address issue #121137 . The test inserted here is derived from the example given in that issue.


Full diff: https://github.com/llvm/llvm-project/pull/135767.diff

3 Files Affected:

  • (modified) llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp (+28)
  • (modified) llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp (+10-6)
  • (added) llvm/test/CodeGen/SystemZ/mcount.ll (+35)
diff --git a/llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp b/llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp
index 9561ea544b270..0936690dafdcd 100644
--- a/llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZFrameLowering.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "SystemZFrameLowering.h"
+#include "MCTargetDesc/SystemZMCTargetDesc.h"
 #include "SystemZCallingConv.h"
 #include "SystemZInstrInfo.h"
 #include "SystemZMachineFunctionInfo.h"
@@ -18,6 +19,8 @@
 #include "llvm/CodeGen/RegisterScavenging.h"
 #include "llvm/CodeGen/TargetLoweringObjectFileImpl.h"
 #include "llvm/IR/Function.h"
+#include "llvm/IR/GlobalVariable.h"
+#include "llvm/IR/Module.h"
 #include "llvm/Target/TargetMachine.h"
 
 using namespace llvm;
@@ -558,6 +561,31 @@ void SystemZELFFrameLowering::emitPrologue(MachineFunction &MF,
   // to determine the end of the prologue.
   DebugLoc DL;
 
+  // Add mcount instrumentation if necessary.
+  if (MF.getFunction().getFnAttribute("instrument-function-entry-inlined").getValueAsString() == "mcount") {
+
+    // Store return address 8 bytes above stack pointer.
+    BuildMI(MBB, MBBI, DL, ZII->get(SystemZ::STG))
+        .addReg(SystemZ::R14D)
+        .addReg(SystemZ::R15D)
+        .addImm(8)
+        .addReg(0);
+
+    // Call mcount (Regmask 0 to ensure this will not be moved by the
+    // scheduler.).
+    const uint32_t Mask = 0;
+    BuildMI(MBB, MBBI, DL, ZII->get(SystemZ::CallBRASL))
+        .addGlobalAddress(MF.getFunction().getParent()->getFunction("mcount"))
+        .addRegMask(&Mask);
+
+    // Reload return address from 8 bytes above stack pointer.
+    BuildMI(MBB, MBBI, DL, ZII->get(SystemZ::LG))
+      .addReg(SystemZ::R14D)
+      .addReg(SystemZ::R15D)
+      .addImm(8)
+      .addReg(0);
+  }
+
   // The current offset of the stack pointer from the CFA.
   int64_t SPOffsetFromCFA = -SystemZMC::ELFCFAOffsetFromInitialSP;
 
diff --git a/llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp b/llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp
index d47f1b4253b54..72e4f8791a91a 100644
--- a/llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp
+++ b/llvm/lib/Transforms/Utils/EntryExitInstrumenter.cpp
@@ -22,7 +22,7 @@
 
 using namespace llvm;
 
-static void insertCall(Function &CurFn, StringRef Func,
+static bool insertCall(Function &CurFn, StringRef Func,
                        BasicBlock::iterator InsertionPt, DebugLoc DL) {
   Module &M = *InsertionPt->getParent()->getParent()->getParent();
   LLVMContext &C = InsertionPt->getParent()->getContext();
@@ -63,12 +63,16 @@ static void insertCall(Function &CurFn, StringRef Func,
                                   false));
       CallInst *Call = CallInst::Create(Fn, RetAddr, "", InsertionPt);
       Call->setDebugLoc(DL);
+    } else if (TargetTriple.isSystemZ()) {
+      M.getOrInsertFunction(Func, Type::getVoidTy(C));
+      // skip insertion for `mcount` on SystemZ. This will be handled later in `emitPrologue`.
+      return false;
     } else {
       FunctionCallee Fn = M.getOrInsertFunction(Func, Type::getVoidTy(C));
       CallInst *Call = CallInst::Create(Fn, "", InsertionPt);
       Call->setDebugLoc(DL);
     }
-    return;
+    return true;
   }
 
   if (Func == "__cyg_profile_func_enter" || Func == "__cyg_profile_func_exit") {
@@ -87,7 +91,7 @@ static void insertCall(Function &CurFn, StringRef Func,
     CallInst *Call =
         CallInst::Create(Fn, ArrayRef<Value *>(Args), "", InsertionPt);
     Call->setDebugLoc(DL);
-    return;
+    return true;
   }
 
   // We only know how to call a fixed set of instrumentation functions, because
@@ -129,9 +133,9 @@ static bool runOnFunction(Function &F, bool PostInlining) {
     if (auto SP = F.getSubprogram())
       DL = DILocation::get(SP->getContext(), SP->getScopeLine(), 0, SP);
 
-    insertCall(F, EntryFunc, F.begin()->getFirstInsertionPt(), DL);
-    Changed = true;
-    F.removeFnAttr(EntryAttr);
+    Changed = insertCall(F, EntryFunc, F.begin()->getFirstInsertionPt(), DL);
+    if (Changed)
+      F.removeFnAttr(EntryAttr);
   }
 
   if (!ExitFunc.empty()) {
diff --git a/llvm/test/CodeGen/SystemZ/mcount.ll b/llvm/test/CodeGen/SystemZ/mcount.ll
new file mode 100644
index 0000000000000..01bd34548f125
--- /dev/null
+++ b/llvm/test/CodeGen/SystemZ/mcount.ll
@@ -0,0 +1,35 @@
+; Test proper insertion of mcount instrumentation
+;
+; RUN: llc < %s -mtriple=s390x-linux-gnu -o - | FileCheck %s
+;
+; CHECK: # %bb.0:
+; CHECK-NEXT: stg %r14, 8(%r15)
+; CHECK-NEXT: brasl %r14, mcount@PLT
+; CHECK-NEXT: lg %r14, 8(%r15)
+define dso_local signext i32 @fib(i32 noundef signext %n) #0 {
+entry:
+  %n.addr = alloca i32, align 4
+  store i32 %n, ptr %n.addr, align 4
+  %0 = load i32, ptr %n.addr, align 4
+  %cmp = icmp sle i32 %0, 1
+  br i1 %cmp, label %cond.true, label %cond.false
+
+cond.true:                                        ; preds = %entry
+  br label %cond.end
+
+cond.false:                                       ; preds = %entry
+  %1 = load i32, ptr %n.addr, align 4
+  %sub = sub nsw i32 %1, 1
+  %call = call signext i32 @fib(i32 noundef signext %sub)
+  %2 = load i32, ptr %n.addr, align 4
+  %sub1 = sub nsw i32 %2, 2
+  %call2 = call signext i32 @fib(i32 noundef signext %sub1)
+  %add = add nsw i32 %call, %call2
+  br label %cond.end
+
+cond.end:                                         ; preds = %cond.false, %cond.true
+  %cond = phi i32 [ 1, %cond.true ], [ %add, %cond.false ]
+  ret i32 %cond
+}
+
+attributes #0 = { "instrument-function-entry-inlined"="mcount" }

@llvmbot
Member

llvmbot commented Apr 15, 2025

@llvm/pr-subscribers-llvm-transforms


github-actions bot commented Apr 15, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@dominik-steenken
Contributor Author

The "Linux Premerge Checks" failure seems procedural rather than related to the PR:

Error: Failed to CreateArtifact: Received non-retryable error: Failed request: (409) Conflict: an artifact with this name already exists on the workflow run

How can I request a rebuild?

@dominik-steenken dominik-steenken force-pushed the fix-mcount-v2 branch 2 times, most recently from c246c4e to d715e01 Compare April 25, 2025 09:20
@dominik-steenken
Contributor Author

@uweigand This is ready for review now.

@dominik-steenken
Contributor Author

dominik-steenken commented May 2, 2025

I incorporated the review comments. Will rebase and push in a few moments.

When compiling with `-pg`, the `EntryExitInstrumenterPass` will insert calls
to the glibc function `mcount` at the beginning of each `MachineFunction`.

On SystemZ, these calls require special handling:

- The call to `mcount` needs to happen at the beginning of the prologue.
- Prior to the call to `mcount`, register `%r14`, the return address of the
  callee function, must be stored 8 bytes above the stack pointer `%r15`.
  After the call to `mcount` returns, that register needs to be restored.

This commit adds some special handling to the EntryExitInstrumenterPass that
keeps the insertion of the mcount function into the module, but skips over
insertion of the actual call in order to perform this insertion in the
`emitPrologue` function. There, a simple sequence of store/call/load is
inserted, which implements the above.
Member

@uweigand uweigand left a comment

LGTM now, thanks!

@uweigand uweigand merged commit 083b4a3 into llvm:main May 2, 2025
6 of 10 checks passed
@llvm-ci
Collaborator

llvm-ci commented May 2, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-debian running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/18287

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: CodeGen/SystemZ/mcount.ll' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/llc < /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/SystemZ/mcount.ll -mtriple=s390x-linux-gnu -o - | /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/SystemZ/mcount.ll # RUN: at line 3
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/llc -mtriple=s390x-linux-gnu -o -
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/CodeGen/SystemZ/mcount.ll

# After Prologue/Epilogue Insertion & Frame Finalization
# Machine code for function fib: NoPHIs, TracksLiveness, NoVRegs, TiedOpsRewritten, TracksDebugUserValues
Frame Objects:
  fi#-4: size=8, align=8, fixed, at location [SP-160]
  fi#-3: size=8, align=8, fixed, at location [SP-40]
  fi#-2: size=8, align=8, fixed, at location [SP-48]
  fi#-1: size=8, align=8, fixed, at location [SP-56]
  fi#0: size=4, align=4, at location [SP-164]
Function Live Ins: $r2d

bb.0.entry:
  successors: %bb.2(0x40000000), %bb.1(0x40000000); %bb.2(50.00%), %bb.1(50.00%)
  liveins: $r2d, $r13d, $r15d, $r14d
  STG $r14d, $r15d, 8, $noreg
  CallBRASL @mcount, <regmask $f0d $f1d $f2d $f3d $f4d $f5d $f6d $f7d $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f0h $f1h $f2h $f3h $f4h $f5h $f6h $f7h $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f0q and 72 more...>, implicit-def $r14d, implicit-def $cc, implicit $fpc
  LG $r14d, $r15d, 8, $noreg
  STMG killed $r13d, killed $r15d, $r15d, 104, implicit killed $r14d
  CFI_INSTRUCTION offset $r13d, -56
  CFI_INSTRUCTION offset $r14d, -48
  CFI_INSTRUCTION offset $r15d, -40
  $r15d = AGHI $r15d(tied-def 0), -168, implicit-def dead $cc
  CFI_INSTRUCTION def_cfa_offset 328
  ST renamable $r2l, $r15d, 164, $noreg :: (store (s32) into %ir.n.addr)
  CHI renamable $r2l, 2, implicit-def $cc, implicit killed $r2d
  renamable $r2l = LHI 1, implicit-def $r2d
  BRC 14, 4, %bb.2, implicit killed $cc
  J %bb.1

bb.1.cond.false:
; predecessors: %bb.0
  successors: %bb.2(0x80000000); %bb.2(100.00%)

  renamable $r0l = LHI -1
  renamable $r0l = A killed renamable $r0l(tied-def 0), $r15d, 164, $noreg, implicit-def dead $cc :: (dereferenceable load (s32) from %ir.n.addr)
  renamable $r2d = LGFR killed renamable $r0l
  CallBRASL @fib, $r2d, <regmask $f8d $f9d $f10d $f11d $f12d $f13d $f14d $f15d $f8h $f9h $f10h $f11h $f12h $f13h $f14h $f15h $f8q $f9q $f12q $f13q $f8s $f9s $f10s $f11s $f12s $f13s $f14s $f15s $r6d $r7d $r8d $r9d $r10d and 30 more...>, implicit-def dead $r14d, implicit-def dead $cc, implicit $fpc, implicit-def $r2d
  renamable $r0l = LHI -2
  renamable $r0l = A killed renamable $r0l(tied-def 0), $r15d, 164, $noreg, implicit-def dead $cc :: (dereferenceable load (s32) from %ir.n.addr)
  renamable $r0d = LGFR killed renamable $r0l
  renamable $r13d = COPY $r2d
  $r2d = COPY killed renamable $r0d
...

@dominik-steenken
Contributor Author

checking...

dominik-steenken added a commit to dominik-steenken/llvm-project that referenced this pull request May 2, 2025
Commit `083b4a3d66` introduced a store-and-load pair around the `BRASL` call to
mcount. That load instruction did not properly declare its target register as
defined, leading to a bad machine instruction.

This commit fixes this by explicitly labeling `%r14` on the load as `def`.
uweigand pushed a commit that referenced this pull request May 2, 2025
Commit `083b4a3d66` introduced a store-and-load pair around the `BRASL`
call to mcount. That load instruction did not properly declare its
target register as defined, leading to a bad machine instruction.

This commit fixes this by explicitly labeling `%r14` on the load as
`def`.
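
As a rough illustration of what "not declared as defined" means, here is a toy verifier-style check in plain C++. It is a sketch under stated assumptions, not LLVM's machine verifier: `ToyOperand`, `ToyInstr`, and `operandsMatchDescription` are hypothetical, and the model assumes only that an instruction description lists how many leading operands must be definitions.

```cpp
#include <vector>

// Toy operand: a register number plus a flag saying whether the instruction
// writes (defines) it. Hypothetical type for illustration only.
struct ToyOperand {
  unsigned reg;
  bool isDef;
};

// Toy instruction: how many leading operands the description expects to be
// definitions, followed by the operands that were actually attached.
struct ToyInstr {
  unsigned numDefs;
  std::vector<ToyOperand> operands;
};

// Every operand that the description lists as a definition must actually be
// flagged as one; otherwise the instruction is considered malformed.
bool operandsMatchDescription(const ToyInstr &mi) {
  if (mi.operands.size() < mi.numDefs)
    return false;
  for (unsigned i = 0; i < mi.numDefs; ++i)
    if (!mi.operands[i].isDef)
      return false;
  return true;
}
```

A load modelled as `ToyInstr{1, {{14, false}, {15, false}}}` fails this check because its destination register is attached as a plain use, while `ToyInstr{1, {{14, true}, {15, false}}}` passes.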
@RKSimon
Collaborator

RKSimon commented May 2, 2025

@dominik-steenken I'm still seeing an EXPENSIVE_CHECKS build failure - any luck tracking it down please?

@dominik-steenken
Contributor Author

Will look into it shortly

@dominik-steenken
Contributor Author

@RKSimon I'm not exactly sure what I'm looking at with this sanitizer test failure. I'm currently trying to reproduce it locally. However, looking at that buildbot, it seems the next build is fine, so I'm cautiously optimistic that this was some sort of transient failure.

@RKSimon
Collaborator

RKSimon commented May 2, 2025

I've only seen this on an EXPENSIVE_CHECKS build so far. You need to enable the LLVM_ENABLE_EXPENSIVE_CHECKS CMake option.

IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
When compiling with `-pg`, the `EntryExitInstrumenterPass` will insert
calls to the glibc function `mcount` at the beginning of each
`MachineFunction`.

On SystemZ, these calls require special handling:

- The call to `mcount` needs to happen at the beginning of the prologue.
- Prior to the call to `mcount`, register `%r14`, the return address of
the callee function, must be stored 8 bytes above the stack pointer
`%r15`. After the call to `mcount` returns, that register needs to be
restored.

This commit adds some special handling to the EntryExitInstrumenterPass
that keeps the insertion of the mcount function into the module, but
skips over insertion of the actual call in order to perform this
insertion in the `emitPrologue` function. There, a simple sequence of
store/call/load is inserted, which implements the above.

The desired change in the `EntryExitInstrumenterPass` necessitated the
addition of a new attribute and attribute kind to each function, which
is used to trigger the postprocessing, aka call insertion, in
`emitPrologue`. Note that the new attribute must be of a different kind
than the `mcount` attribute, since otherwise it would replace that
attribute and later be deleted by the code that intended to delete
`mcount`. The new attribute is called `insert-mcount`, while the
attribute kind is `systemz-backend`, to clearly mark it as a
SystemZ-specific backend concern.

This PR should address issue llvm#121137 . The test inserted here is derived
from the example given in that issue.
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
Commit `083b4a3d66` introduced a store-and-load pair around the `BRASL`
call to mcount. That load instruction did not properly declare its
target register as defined, leading to a bad machine instruction.

This commit fixes this by explicitly labeling `%r14` on the load as
`def`.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025