[memprof] Introduce handleCallSite (NFC) #149724
Merged
kazutakahirata merged 1 commit into llvm:main on Jul 21, 2025
Continuing the effort to refactor readMemprof, this patch introduces handleCallSite to handle, well, call sites. Moving the code requires taking CallSiteEntry and CallSiteEntryHash out of readMemprof, because the new function names both types in its signature. We could simplify some code, but I'm keeping this patch very simple to facilitate the review process. For example, we could simplify the control flow near the end of readMemprof, but we can address that later.
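The reason the two structs have to move can be illustrated with a small standalone sketch. A type declared inside a function body cannot be named in another function's signature, so a helper extracted from that function needs the type at namespace scope. Everything below (Entry, EntryHash, containsId, processAll) is hypothetical and is not taken from MemProfUse.cpp; it only mirrors the shape of the change.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a struct that used to be declared inside a
// function body. At namespace scope it can be named by several functions.
struct Entry {
  int Id;
  bool operator==(const Entry &Other) const { return Id == Other.Id; }
};

// Hypothetical hash functor, hoisted together with Entry.
struct EntryHash {
  std::size_t operator()(const Entry &E) const {
    return static_cast<std::size_t>(E.Id);
  }
};

// Extracted helper. Its parameter type names Entry and EntryHash, which
// would not be visible here if they were still local to processAll().
bool containsId(const std::unordered_set<Entry, EntryHash> &Entries, int Id) {
  return Entries.count(Entry{Id}) != 0;
}

// Caller that builds the set and delegates the lookup to the helper.
int processAll(const std::vector<int> &Ids, int Wanted) {
  std::unordered_set<Entry, EntryHash> Entries;
  for (int Id : Ids)
    Entries.insert(Entry{Id});
  return containsId(Entries, Wanted) ? 1 : 0;
}
```

With this layout, processAll({1, 2, 3}, 2) returns 1 and processAll({1, 2, 3}, 7) returns 0.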
@llvm/pr-subscribers-llvm-transforms

Author: Kazu Hirata (kazutakahirata)

Full diff: https://github.com/llvm/llvm-project/pull/149724.diff

1 Files Affected:
diff --git a/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp b/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
index 2a8416d02ffba..6e57b99c3233f 100644
--- a/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemProfUse.cpp
@@ -429,6 +429,63 @@ handleAllocSite(Instruction &I, CallBase *CI,
   }
 }
 
+// Helper struct for maintaining refs to callsite data. As an alternative we
+// could store a pointer to the CallSiteInfo struct but we also need the frame
+// index. Using ArrayRefs instead makes it a little easier to read.
+struct CallSiteEntry {
+  // Subset of frames for the corresponding CallSiteInfo.
+  ArrayRef<Frame> Frames;
+  // Potential targets for indirect calls.
+  ArrayRef<GlobalValue::GUID> CalleeGuids;
+
+  // Only compare Frame contents.
+  // Use pointer-based equality instead of ArrayRef's operator== which does
+  // element-wise comparison. We want to check if it's the same slice of the
+  // underlying array, not just equivalent content.
+  bool operator==(const CallSiteEntry &Other) const {
+    return Frames.data() == Other.Frames.data() &&
+           Frames.size() == Other.Frames.size();
+  }
+};
+
+struct CallSiteEntryHash {
+  size_t operator()(const CallSiteEntry &Entry) const {
+    return computeFullStackId(Entry.Frames);
+  }
+};
+
+static void handleCallSite(
+    Instruction &I, const Function *CalledFunction,
+    ArrayRef<uint64_t> InlinedCallStack,
+    const std::unordered_set<CallSiteEntry, CallSiteEntryHash> &CallSiteEntries,
+    Module &M, std::set<std::vector<uint64_t>> &MatchedCallSites) {
+  auto &Ctx = M.getContext();
+  for (const auto &CallSiteEntry : CallSiteEntries) {
+    // If we found and thus matched all frames on the call, create and
+    // attach call stack metadata.
+    if (stackFrameIncludesInlinedCallStack(CallSiteEntry.Frames,
+                                           InlinedCallStack)) {
+      NumOfMemProfMatchedCallSites++;
+      addCallsiteMetadata(I, InlinedCallStack, Ctx);
+
+      // Try to attach indirect call metadata if possible.
+      if (!CalledFunction)
+        addVPMetadata(M, I, CallSiteEntry.CalleeGuids);
+
+      // Only need to find one with a matching call stack and add a single
+      // callsite metadata.
+
+      // Accumulate call site matching information upon request.
+      if (ClPrintMemProfMatchInfo) {
+        std::vector<uint64_t> CallStack;
+        append_range(CallStack, InlinedCallStack);
+        MatchedCallSites.insert(std::move(CallStack));
+      }
+      break;
+    }
+  }
+}
+
 static void readMemprof(Module &M, Function &F,
                         IndexedInstrProfReader *MemProfReader,
                         const TargetLibraryInfo &TLI,
@@ -499,31 +556,6 @@ static void readMemprof(Module &M, Function &F,
   // (allocation info and the callsites).
   std::map<uint64_t, std::set<const AllocationInfo *>> LocHashToAllocInfo;
 
-  // Helper struct for maintaining refs to callsite data. As an alternative we
-  // could store a pointer to the CallSiteInfo struct but we also need the frame
-  // index. Using ArrayRefs instead makes it a little easier to read.
-  struct CallSiteEntry {
-    // Subset of frames for the corresponding CallSiteInfo.
-    ArrayRef<Frame> Frames;
-    // Potential targets for indirect calls.
-    ArrayRef<GlobalValue::GUID> CalleeGuids;
-
-    // Only compare Frame contents.
-    // Use pointer-based equality instead of ArrayRef's operator== which does
-    // element-wise comparison. We want to check if it's the same slice of the
-    // underlying array, not just equivalent content.
-    bool operator==(const CallSiteEntry &Other) const {
-      return Frames.data() == Other.Frames.data() &&
-             Frames.size() == Other.Frames.size();
-    }
-  };
-
-  struct CallSiteEntryHash {
-    size_t operator()(const CallSiteEntry &Entry) const {
-      return computeFullStackId(Entry.Frames);
-    }
-  };
-
   // For the callsites we need to record slices of the frame array (see comments
   // below where the map entries are added) along with their CalleeGuids.
   std::map<uint64_t, std::unordered_set<CallSiteEntry, CallSiteEntryHash>>
@@ -633,30 +665,8 @@ static void readMemprof(Module &M, Function &F,
       // Otherwise, add callsite metadata. If we reach here then we found the
       // instruction's leaf location in the callsites map and not the allocation
       // map.
-      for (const auto &CallSiteEntry : CallSitesIter->second) {
-        // If we found and thus matched all frames on the call, create and
-        // attach call stack metadata.
-        if (stackFrameIncludesInlinedCallStack(CallSiteEntry.Frames,
-                                               InlinedCallStack)) {
-          NumOfMemProfMatchedCallSites++;
-          addCallsiteMetadata(I, InlinedCallStack, Ctx);
-
-          // Try to attach indirect call metadata if possible.
-          if (!CalledFunction)
-            addVPMetadata(M, I, CallSiteEntry.CalleeGuids);
-
-          // Only need to find one with a matching call stack and add a single
-          // callsite metadata.
-
-          // Accumulate call site matching information upon request.
-          if (ClPrintMemProfMatchInfo) {
-            std::vector<uint64_t> CallStack;
-            append_range(CallStack, InlinedCallStack);
-            MatchedCallSites.insert(std::move(CallStack));
-          }
-          break;
-        }
-      }
+      handleCallSite(I, CalledFunction, InlinedCallStack, CallSitesIter->second,
+                     M, MatchedCallSites);
     }
   }
 }
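The comment on CallSiteEntry::operator== in the diff above distinguishes "same slice of the underlying array" from "equivalent content". A minimal standalone sketch of that distinction follows. It does not use LLVM: Slice is a hypothetical stand-in for llvm::ArrayRef, and SliceEntry, SliceEntryHash, and countDistinct are illustrative names that do not appear in the patch. The hash here is a simple content-based combine chosen only for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for llvm::ArrayRef<uint64_t>: a non-owning
// (pointer, length) view into storage owned elsewhere.
struct Slice {
  const uint64_t *Data;
  std::size_t Size;
};

struct SliceEntry {
  Slice Frames;

  // Identity comparison: equal only if both views start at the same address
  // and have the same length. Two views over different buffers with equal
  // contents compare unequal.
  bool operator==(const SliceEntry &Other) const {
    return Frames.Data == Other.Frames.Data &&
           Frames.Size == Other.Frames.Size;
  }
};

// Content-based hash (illustrative only). Entries that are identical under
// operator== view the same memory, so they always hash to the same value,
// which is what std::unordered_set requires.
struct SliceEntryHash {
  std::size_t operator()(const SliceEntry &E) const {
    std::size_t H = 0;
    for (std::size_t I = 0; I != E.Frames.Size; ++I)
      H = H * 31 + std::hash<uint64_t>{}(E.Frames.Data[I]);
    return H;
  }
};

// Counts how many entries are distinct under identity comparison.
std::size_t countDistinct(const std::vector<SliceEntry> &Entries) {
  std::unordered_set<SliceEntry, SliceEntryHash> Set(Entries.begin(),
                                                     Entries.end());
  return Set.size();
}
```

In this sketch, two views over the same buffer with the same length collapse to one set element, while a view over a second buffer with equal contents, or a shorter prefix of the first buffer, remains a separate element. The patch pairs its pointer-based operator== with a hash computed from Entry.Frames via computeFullStackId, and the sketch follows the same pairing.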
snehasish approved these changes Jul 20, 2025
This was referenced Jul 23, 2025
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025