Commit 52c583b

[SampleFDO][TypeProf]Support vtable type profiling for ext-binary and text format (llvm#148002)
This change extends the SampleFDO ext-binary and text formats to record vtable symbols and their counts for virtual calls inside a function. The vtable profiles allow the compiler to annotate vtable types on IR instructions and perform vtable-based indirect call promotion. An RFC is in https://discourse.llvm.org/t/rfc-vtable-type-profiling-for-samplefdo/87283

Given the function below, the table shows the function's profile in text format before and after this change:

```
__attribute__((noinline)) int loop_func(int i, int a, int b) {
  Base *ptr = createType(i);
  int sum = ptr->func(a, b);
  delete ptr;
  return sum;
}
```

| before | after |
| --- | --- |
| Samples collected in the function's body { <br> 0: 636241 <br> 1: 681458, calls: _Z10createTypei:681458 <br> 3: 543499, calls: _ZN12_GLOBAL__N_18Derived24funcEii:410621 _ZN8Derived14funcEii:132878 <br> 5.1: 602201, calls: _ZN12_GLOBAL__N_18Derived2D0Ev:454635 _ZN8Derived1D0Ev:147566 <br> 7: 511057 <br> } | Samples collected in the function's body { <br> 0: 636241 <br> 1: 681458, calls: _Z10createTypei:681458 <br> 3: 543499, calls: _ZN12_GLOBAL__N_18Derived24funcEii:410621 _ZN8Derived14funcEii:132878 <br> 3: vtables: _ZTV8Derived1:1377 _ZTVN12_GLOBAL__N_18Derived2E:4250 <br> 5.1: 602201, calls: _ZN12_GLOBAL__N_18Derived2D0Ev:454635 _ZN8Derived1D0Ev:147566 <br> 5.1: vtables: _ZTV8Derived1:227 _ZTVN12_GLOBAL__N_18Derived2E:765 <br> 7: 511057 <br> } |

Key points for this change:

1. In-memory representation of vtable profiles
   * A field of type `map<LineLocation, map<FunctionId, uint64_t>>` is introduced in a function's in-memory representation [FunctionSamples](https://github.com/llvm/llvm-project/blob/ccc416312ed72e92a885425d9cb9c01f9afa58eb/llvm/include/llvm/ProfileData/SampleProf.h#L749-L754).
2. The vtable counters for one LineLocation represent the relative frequency among vtables for that LineLocation. They are not required to be comparable across LineLocations.
3. For backward compatibility of the ext-binary format, we take one bit from ProfSummaryFlag, as illustrated in the enum class `SecProfSummaryFlags`. The ext-binary profile reader parses the integer flag and reads this bit. If it is set, the profile reader parses vtable profiles.
4. The vtable profiles are optional in the ext-binary format and are not serialized by default. We introduce an LLVM boolean option (named `-extbinary-write-vtable-type-prof`). The ext-binary profile writer reads the option and decides whether to set the section flag bit and serialize the in-memory class members corresponding to vtables.
5. This change doesn't implement `llvm-profdata overlap --sample` for the vtable profiles. A subsequent change will do it to keep this one focused on the profile format change.

We don't plan to add vtable support to the non-extensible format, mainly because of the maintenance cost of keeping backward compatibility with prior versions of profile data.

* The [non-extensible binary format](https://github.com/llvm/llvm-project/blob/5c28af409978c19a35021855a29dcaa65e95da00/llvm/lib/ProfileData/SampleProfWriter.cpp#L899-L900) does not currently have feature parity with the extensible binary format. For instance, the former doesn't support [profile symbol list](https://github.com/llvm/llvm-project/blob/41e22aa31b1905aa3e9d83c0343a96ec0d5187ec/llvm/include/llvm/ProfileData/SampleProf.h#L1518-L1522) or context-sensitive PGO, both of which give a measurable performance boost. Presumably the non-extensible format is not in wide use.

---------

Co-authored-by: Paschalis Mpeis <[email protected]>
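To make the text-format change concrete, here is a minimal sketch that parses one of the new `vtables:` lines from the "after" column into a location string and a symbol-to-count map. It is not LLVM's text profile reader: `VTableLine` and `parseVTableLine` are hypothetical names, and the sketch assumes only the `<loc>: vtables: <sym>:<count> ...` layout shown in the table.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical result type for illustration; not an LLVM type.
struct VTableLine {
  std::string Loc;                        // e.g. "3" or "5.1"
  std::map<std::string, uint64_t> Counts; // vtable symbol -> count
};

// Parses "<loc>: vtables: <sym>:<count> <sym>:<count> ...".
// Returns std::nullopt if the line is not a well-formed vtable line.
std::optional<VTableLine> parseVTableLine(const std::string &Line) {
  static const std::string Marker = ": vtables: ";
  const size_t Pos = Line.find(Marker);
  if (Pos == std::string::npos || Pos == 0)
    return std::nullopt;

  VTableLine Result;
  Result.Loc = Line.substr(0, Pos);

  std::istringstream Rest(Line.substr(Pos + Marker.size()));
  std::string Token;
  while (Rest >> Token) {
    // Split each token at its last ':' into symbol and count.
    const size_t Colon = Token.rfind(':');
    if (Colon == std::string::npos || Colon == 0 || Colon + 1 == Token.size())
      return std::nullopt;
    const std::string Digits = Token.substr(Colon + 1);
    // Accept at most 19 decimal digits so std::stoull cannot overflow.
    if (Digits.size() > 19 ||
        Digits.find_first_not_of("0123456789") != std::string::npos)
      return std::nullopt;
    Result.Counts[Token.substr(0, Colon)] += std::stoull(Digits);
  }
  if (Result.Counts.empty())
    return std::nullopt;
  return Result;
}
```

Ordinary body-sample lines such as `3: 543499, calls: ...` do not contain the `: vtables: ` marker, so the sketch rejects them and a caller could fall back to the existing line parser.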
1 parent 1131e44 commit 52c583b

11 files changed, +436 -17 lines changed

llvm/include/llvm/ProfileData/SampleProf.h

Lines changed: 95 additions & 6 deletions
@@ -62,7 +62,7 @@ enum class sampleprof_error {
   uncompress_failed,
   zlib_unavailable,
   hash_mismatch,
-  illegal_line_offset
+  illegal_line_offset,
 };
 
 inline std::error_code make_error_code(sampleprof_error E) {
@@ -91,6 +91,8 @@ struct is_error_code_enum<llvm::sampleprof_error> : std::true_type {};
 namespace llvm {
 namespace sampleprof {
 
+constexpr char kVTableProfPrefix[] = "vtables ";
+
 enum SampleProfileFormat {
   SPF_None = 0,
   SPF_Text = 0x1,
@@ -204,6 +206,9 @@ enum class SecProfSummaryFlags : uint32_t {
   /// SecFlagIsPreInlined means this profile contains ShouldBeInlined
   /// contexts thus this is CS preinliner computed.
   SecFlagIsPreInlined = (1 << 4),
+
+  /// SecFlagHasVTableTypeProf means this profile contains vtable type profiles.
+  SecFlagHasVTableTypeProf = (1 << 5),
 };
 
 enum class SecFuncMetadataFlags : uint32_t {
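The new enumerator claims bit 5 of the profile summary section flags. The sketch below shows how a single bit like this can be set by a writer and tested by a reader without disturbing the other flags. The enum is a local two-member copy for illustration, and `hasFlag` and `withFlag` are hypothetical helpers, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Local copy of two enumerators for illustration only; the real enum in
// llvm/ProfileData/SampleProf.h has more members.
enum class SecProfSummaryFlags : uint32_t {
  SecFlagIsPreInlined = (1 << 4),
  SecFlagHasVTableTypeProf = (1 << 5),
};

// Hypothetical helper: true if Flag's bit is set in the raw section flags.
constexpr bool hasFlag(uint32_t RawFlags, SecProfSummaryFlags Flag) {
  return (RawFlags & static_cast<uint32_t>(Flag)) != 0;
}

// Hypothetical helper: returns RawFlags with Flag's bit set.
constexpr uint32_t withFlag(uint32_t RawFlags, SecProfSummaryFlags Flag) {
  return RawFlags | static_cast<uint32_t>(Flag);
}
```

A profile written before this change never has bit 5 set, so under this scheme a reader that checks the bit would treat such a profile as having no vtable records.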
@@ -303,7 +308,7 @@ struct LineLocation {
   }
 
   uint64_t getHashCode() const {
-    return ((uint64_t) Discriminator << 32) | LineOffset;
+    return ((uint64_t)Discriminator << 32) | LineOffset;
   }
 
   uint32_t LineOffset;
@@ -318,16 +323,30 @@ struct LineLocationHash {
 
 LLVM_ABI raw_ostream &operator<<(raw_ostream &OS, const LineLocation &Loc);
 
+/// Key represents type of a C++ polymorphic class type by its vtable and value
+/// represents its counter.
+/// TODO: The class name FunctionId should be renamed to SymbolId in a refactor
+/// change.
+using TypeCountMap = std::map<FunctionId, uint64_t>;
+
+/// Write \p Map to the output stream. Keys are linearized using \p NameTable
+/// and written as ULEB128. Values are written as ULEB128 as well.
+std::error_code
+serializeTypeMap(const TypeCountMap &Map,
+                 const MapVector<FunctionId, uint32_t> &NameTable,
+                 raw_ostream &OS);
+
 /// Representation of a single sample record.
 ///
 /// A sample record is represented by a positive integer value, which
 /// indicates how frequently was the associated line location executed.
 ///
 /// Additionally, if the associated location contains a function call,
-/// the record will hold a list of all the possible called targets. For
-/// direct calls, this will be the exact function being invoked. For
-/// indirect calls (function pointers, virtual table dispatch), this
-/// will be a list of one or more functions.
+/// the record will hold a list of all the possible called targets and the types
+/// for virtual table dispatches. For direct calls, this will be the exact
+/// function being invoked. For indirect calls (function pointers, virtual table
+/// dispatch), this will be a list of one or more functions. For virtual table
+/// dispatches, this record will also hold the type of the object.
 class SampleRecord {
 public:
   using CallTarget = std::pair<FunctionId, uint64_t>;
@@ -746,6 +765,7 @@ using BodySampleMap = std::map<LineLocation, SampleRecord>;
 // memory, which is *very* significant for large profiles.
 using FunctionSamplesMap = std::map<FunctionId, FunctionSamples>;
 using CallsiteSampleMap = std::map<LineLocation, FunctionSamplesMap>;
+using CallsiteTypeMap = std::map<LineLocation, TypeCountMap>;
 using LocToLocMap =
     std::unordered_map<LineLocation, LineLocation, LineLocationHash>;

@@ -939,6 +959,14 @@ class FunctionSamples {
     return &Iter->second;
   }
 
+  /// Returns the TypeCountMap for inlined callsites at the given \p Loc.
+  const TypeCountMap *findCallsiteTypeSamplesAt(const LineLocation &Loc) const {
+    auto Iter = VirtualCallsiteTypeCounts.find(mapIRLocToProfileLoc(Loc));
+    if (Iter == VirtualCallsiteTypeCounts.end())
+      return nullptr;
+    return &Iter->second;
+  }
+
   /// Returns a pointer to FunctionSamples at the given callsite location
   /// \p Loc with callee \p CalleeName. If no callsite can be found, relax
   /// the restriction to return the FunctionSamples at callsite location
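The lookup added above returns a pointer so that "no vtable counts at this location" can be reported as `nullptr` without inserting an empty entry. A minimal standalone sketch of that pattern follows. It uses simplified stand-ins (a `std::pair` for `LineLocation`, `std::string` for `FunctionId`), omits the `mapIRLocToProfileLoc` step, and `findTypeSamplesAt` is a hypothetical free function.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Simplified stand-ins for illustration: (line offset, discriminator)
// replaces LineLocation and std::string replaces FunctionId.
using Loc = std::pair<uint32_t, uint32_t>;
using TypeCountMap = std::map<std::string, uint64_t>;
using CallsiteTypeMap = std::map<Loc, TypeCountMap>;

// Read-only lookup: returns nullptr when no vtable counts were recorded for
// L, and never inserts an empty entry into the map.
const TypeCountMap *findTypeSamplesAt(const CallsiteTypeMap &Map,
                                      const Loc &L) {
  auto It = Map.find(L);
  return It == Map.end() ? nullptr : &It->second;
}
```

Using `find` rather than `operator[]` keeps the map unchanged on a miss, which matters when the same map is later serialized or printed.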
@@ -1000,6 +1028,46 @@ class FunctionSamples {
     return CallsiteSamples;
   }
 
+  /// Returns vtable access samples for the C++ types collected in this
+  /// function.
+  const CallsiteTypeMap &getCallsiteTypeCounts() const {
+    return VirtualCallsiteTypeCounts;
+  }
+
+  /// Returns the vtable access samples for the C++ types for \p Loc.
+  /// Under the hood, the caller-specified \p Loc will be un-drifted before the
+  /// type sample lookup if possible.
+  TypeCountMap &getTypeSamplesAt(const LineLocation &Loc) {
+    return VirtualCallsiteTypeCounts[mapIRLocToProfileLoc(Loc)];
+  }
+
+  /// Scale \p Other sample counts by \p Weight and add the scaled result to the
+  /// type samples for \p Loc. Under the hood, the caller-provided \p Loc will
+  /// be un-drifted before the type sample lookup if possible.
+  /// typename T is either a std::map or a DenseMap.
+  template <typename T>
+  sampleprof_error addCallsiteVTableTypeProfAt(const LineLocation &Loc,
+                                               const T &Other,
+                                               uint64_t Weight = 1) {
+    static_assert((std::is_same_v<typename T::key_type, StringRef> ||
+                   std::is_same_v<typename T::key_type, FunctionId>) &&
+                      std::is_same_v<typename T::mapped_type, uint64_t>,
+                  "T must be a map with StringRef or FunctionId as key and "
+                  "uint64_t as value");
+    TypeCountMap &TypeCounts = getTypeSamplesAt(Loc);
+    bool Overflowed = false;
+
+    for (const auto [Type, Count] : Other) {
+      FunctionId TypeId(Type);
+      bool RowOverflow = false;
+      TypeCounts[TypeId] = SaturatingMultiplyAdd(
+          Count, Weight, TypeCounts[TypeId], &RowOverflow);
+      Overflowed |= RowOverflow;
+    }
+    return Overflowed ? sampleprof_error::counter_overflow
+                      : sampleprof_error::success;
+  }
+
   /// Return the maximum of sample counts in a function body. When SkipCallSite
   /// is false, which is the default, the return count includes samples in the
   /// inlined functions. When SkipCallSite is true, the return count only
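The merge helper above computes `Count * Weight + existing` per vtable and clamps on overflow instead of wrapping. The sketch below reproduces that arithmetic in standalone form under simplifying assumptions: `std::string` stands in for `FunctionId`, and `saturatingMultiplyAdd` and `mergeScaled` are hypothetical names written from scratch here rather than LLVM's own `SaturatingMultiplyAdd`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <map>
#include <string>

using TypeCountMap = std::map<std::string, uint64_t>;

// Computes X * Y + A, clamping to UINT64_MAX and reporting overflow.
uint64_t saturatingMultiplyAdd(uint64_t X, uint64_t Y, uint64_t A,
                               bool &Overflowed) {
  constexpr uint64_t Max = std::numeric_limits<uint64_t>::max();
  Overflowed = false;
  // X * Y overflows exactly when Y > Max / X (for nonzero X).
  if (X != 0 && Y > Max / X) {
    Overflowed = true;
    return Max;
  }
  const uint64_t Product = X * Y;
  if (Product > Max - A) {
    Overflowed = true;
    return Max;
  }
  return Product + A;
}

// Adds Other's counts, scaled by Weight, into Dest. Returns true if any
// counter saturated.
bool mergeScaled(TypeCountMap &Dest, const TypeCountMap &Other,
                 uint64_t Weight) {
  bool AnyOverflow = false;
  for (const auto &[Type, Count] : Other) {
    bool RowOverflow = false;
    uint64_t &Slot = Dest[Type]; // value-initialized to 0 if absent
    Slot = saturatingMultiplyAdd(Count, Weight, Slot, RowOverflow);
    AnyOverflow |= RowOverflow;
  }
  return AnyOverflow;
}
```

For example, merging `{A: 5, B: 7}` with weight 2 into `{A: 10}` yields `{A: 20, B: 14}` with no overflow reported.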
@@ -1054,6 +1122,10 @@ class FunctionSamples {
         mergeSampleProfErrors(Result,
                               FSMap[Rec.first].merge(Rec.second, Weight));
     }
+    for (const auto &[Loc, OtherTypeMap] : Other.getCallsiteTypeCounts())
+      mergeSampleProfErrors(
+          Result, addCallsiteVTableTypeProfAt(Loc, OtherTypeMap, Weight));
+
     return Result;
   }

@@ -1297,6 +1369,23 @@ class FunctionSamples {
   /// collected in the call to baz() at line offset 8.
   CallsiteSampleMap CallsiteSamples;
 
+  /// Map a virtual callsite to the list of accessed vtables and vtable counts.
+  /// The callsite is referenced by its source location.
+  ///
+  /// For example, given:
+  ///
+  ///     void foo() {
+  ///       ...
+  ///  5    inlined_vcall_bar();
+  ///       ...
+  ///  5    inlined_vcall_baz();
+  ///       ...
+  ///  200  inlined_vcall_qux();
+  ///     }
+  /// This map will contain two entries. One with two types for line offset 5
+  /// and one with one type for line offset 200.
+  CallsiteTypeMap VirtualCallsiteTypeCounts;
+
   /// IR to profile location map generated by stale profile matching.
   ///
   /// Each entry is a mapping from the location on current build to the matched
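The `foo()` example in the `VirtualCallsiteTypeCounts` comment can be spelled out as data. The sketch below builds the nested map with a simplified `LineLocation` stand-in and `std::string` keys. The vtable symbol names and counts are made up for illustration, and `buildFooExample` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Simplified stand-in for sampleprof::LineLocation.
struct LineLocation {
  uint32_t LineOffset;
  uint32_t Discriminator;
  bool operator<(const LineLocation &O) const {
    return std::tie(LineOffset, Discriminator) <
           std::tie(O.LineOffset, O.Discriminator);
  }
};

using TypeCountMap = std::map<std::string, uint64_t>;
using CallsiteTypeMap = std::map<LineLocation, TypeCountMap>;

// Builds the map described for foo(). Symbol names and counts are invented.
CallsiteTypeMap buildFooExample() {
  CallsiteTypeMap Map;
  Map[{5, 0}]["_ZTV3Bar"] = 100; // vtable seen at inlined_vcall_bar()
  Map[{5, 0}]["_ZTV3Baz"] = 40;  // vtable seen at inlined_vcall_baz()
  Map[{200, 0}]["_ZTV3Qux"] = 7; // vtable seen at inlined_vcall_qux()
  return Map;
}
```

Because both calls at line offset 5 share one key, their vtables land in the same inner map, which gives the "two entries" shape the comment describes.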

llvm/include/llvm/ProfileData/SampleProfReader.h

Lines changed: 12 additions & 0 deletions
@@ -589,6 +589,10 @@ class SampleProfileReader {
   /// Whether the function profiles use FS discriminators.
   bool ProfileIsFS = false;
 
+  /// If true, the profile has vtable profiles and reader should decode them
+  /// to parse profiles correctly.
+  bool ReadVTableProf = false;
+
   /// \brief The format of sample.
   SampleProfileFormat Format = SPF_None;

@@ -703,6 +707,14 @@ class LLVM_ABI SampleProfileReaderBinary : public SampleProfileReader {
   /// otherwise same as readStringFromTable, also return its hash value.
   ErrorOr<std::pair<SampleContext, uint64_t>> readSampleContextFromTable();
 
+  /// Read all virtual functions' vtable access counts for \p FProfile.
+  std::error_code readCallsiteVTableProf(FunctionSamples &FProfile);
+
+  /// Read bytes from the input buffer pointed by `Data` and decode them into
+  /// \p M. `Data` will be advanced to the end of the read bytes when this
+  /// function returns. Returns error if any.
+  std::error_code readVTableTypeCountMap(TypeCountMap &M);
+
   /// Points to the current location in the buffer.
   const uint8_t *Data = nullptr;
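The reader implementation is not part of the files shown here, so the following is only a sketch of what decoding a type-count map could look like, assuming the layout that `serializeTypeMap` writes: an entry count followed by (name-table index, count) pairs, all ULEB128. `readULEB128` and `readTypeCountMap` are hypothetical standalone functions, and keys are left as raw indices instead of being resolved through a name table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

// Decodes one ULEB128 value from Buf starting at Pos and advances Pos.
// Returns std::nullopt on truncated input or a value wider than 64 bits.
std::optional<uint64_t> readULEB128(const std::vector<uint8_t> &Buf,
                                    size_t &Pos) {
  uint64_t Value = 0;
  for (unsigned Shift = 0; Pos < Buf.size(); Shift += 7) {
    const uint8_t Byte = Buf[Pos++];
    const uint64_t Low7 = Byte & 0x7f;
    if (Shift >= 64 || (Low7 << Shift) >> Shift != Low7)
      return std::nullopt; // does not fit in 64 bits
    Value |= Low7 << Shift;
    if ((Byte & 0x80) == 0)
      return Value; // high bit clear marks the last byte
  }
  return std::nullopt; // ran out of bytes before the terminating byte
}

// Reads <N> followed by N (name index, count) pairs.
std::optional<std::map<uint64_t, uint64_t>>
readTypeCountMap(const std::vector<uint8_t> &Buf, size_t &Pos) {
  std::map<uint64_t, uint64_t> Result;
  const auto NumEntries = readULEB128(Buf, Pos);
  if (!NumEntries)
    return std::nullopt;
  for (uint64_t I = 0; I < *NumEntries; ++I) {
    const auto Index = readULEB128(Buf, Pos);
    if (!Index)
      return std::nullopt;
    const auto Count = readULEB128(Buf, Pos);
    if (!Count)
      return std::nullopt;
    Result[*Index] += *Count;
  }
  return Result;
}
```

As with the `Data` pointer described in the header comment, `Pos` ends up just past the last byte consumed, so a caller can continue decoding the next record from there.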

llvm/include/llvm/ProfileData/SampleProfWriter.h

Lines changed: 10 additions & 4 deletions
@@ -217,13 +217,20 @@ class LLVM_ABI SampleProfileWriterBinary : public SampleProfileWriter {
   std::error_code writeBody(const FunctionSamples &S);
   inline void stablizeNameTable(MapVector<FunctionId, uint32_t> &NameTable,
                                 std::set<FunctionId> &V);
-
+
   MapVector<FunctionId, uint32_t> NameTable;
-
+
   void addName(FunctionId FName);
   virtual void addContext(const SampleContext &Context);
   void addNames(const FunctionSamples &S);
 
+  /// Write \p CallsiteTypeMap to the output stream \p OS.
+  std::error_code
+  writeCallsiteVTableProf(const CallsiteTypeMap &CallsiteTypeMap,
+                          raw_ostream &OS);
+
+  bool WriteVTableProf = false;
+
 private:
   LLVM_ABI friend ErrorOr<std::unique_ptr<SampleProfileWriter>>
   SampleProfileWriter::create(std::unique_ptr<raw_ostream> &OS,
@@ -412,8 +419,7 @@ class LLVM_ABI SampleProfileWriterExtBinaryBase
 class LLVM_ABI SampleProfileWriterExtBinary
     : public SampleProfileWriterExtBinaryBase {
 public:
-  SampleProfileWriterExtBinary(std::unique_ptr<raw_ostream> &OS)
-      : SampleProfileWriterExtBinaryBase(OS) {}
+  SampleProfileWriterExtBinary(std::unique_ptr<raw_ostream> &OS);
 
 private:
   std::error_code writeDefaultLayout(const SampleProfileMap &ProfileMap);

llvm/lib/ProfileData/SampleProf.cpp

Lines changed: 40 additions & 0 deletions
@@ -47,6 +47,24 @@ bool FunctionSamples::ProfileIsPreInlined = false;
 bool FunctionSamples::UseMD5 = false;
 bool FunctionSamples::HasUniqSuffix = true;
 bool FunctionSamples::ProfileIsFS = false;
+
+std::error_code
+serializeTypeMap(const TypeCountMap &Map,
+                 const MapVector<FunctionId, uint32_t> &NameTable,
+                 raw_ostream &OS) {
+  encodeULEB128(Map.size(), OS);
+  for (const auto &[TypeName, SampleCount] : Map) {
+    if (auto NameIndexIter = NameTable.find(TypeName);
+        NameIndexIter != NameTable.end()) {
+      encodeULEB128(NameIndexIter->second, OS);
+    } else {
+      // If the type is not in the name table, we cannot serialize it.
+      return sampleprof_error::truncated_name_table;
+    }
+    encodeULEB128(SampleCount, OS);
+  }
+  return sampleprof_error::success;
+}
 } // namespace sampleprof
 } // namespace llvm
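To show the byte layout `serializeTypeMap` produces, the sketch below re-implements the same steps against a `std::vector<uint8_t>` with standard containers in place of `MapVector`, `FunctionId`, and `raw_ostream`. `writeULEB128` and `serializeTypeMapSketch` are hypothetical names. One deliberate difference, stated as an assumption of this sketch: it buffers into a temporary so that nothing is emitted when a key is missing from the name table, whereas the function above writes to the stream as it goes.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Appends Value to Out in ULEB128: 7 payload bits per byte, least
// significant group first, high bit set on every byte except the last.
void writeULEB128(uint64_t Value, std::vector<uint8_t> &Out) {
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80;
    Out.push_back(Byte);
  } while (Value != 0);
}

// Writes <size> then (name index, count) per entry. Returns false, leaving
// Out untouched, if a key is missing from NameTable.
bool serializeTypeMapSketch(const std::map<std::string, uint64_t> &Map,
                            const std::map<std::string, uint32_t> &NameTable,
                            std::vector<uint8_t> &Out) {
  std::vector<uint8_t> Tmp;
  writeULEB128(Map.size(), Tmp);
  for (const auto &[Name, Count] : Map) {
    auto It = NameTable.find(Name);
    if (It == NameTable.end())
      return false;
    writeULEB128(It->second, Tmp);
    writeULEB128(Count, Tmp);
  }
  Out.insert(Out.end(), Tmp.begin(), Tmp.end());
  return true;
}
```

With two entries whose name indices are 1 and 0 and whose counts are 1377 and 5, the output is the six bytes `02 01 E1 0A 00 05`: the count 1377 needs two bytes because it exceeds 127.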

@@ -178,6 +196,17 @@ raw_ostream &llvm::sampleprof::operator<<(raw_ostream &OS,
   return OS;
 }
 
+static void printTypeCountMap(raw_ostream &OS, LineLocation Loc,
+                              const TypeCountMap &TypeCountMap) {
+  if (TypeCountMap.empty()) {
+    return;
+  }
+  OS << Loc << ": vtables: ";
+  for (const auto &[Type, Count] : TypeCountMap)
+    OS << Type << ":" << Count << " ";
+  OS << "\n";
+}
+
 /// Print the samples collected for a function on stream \p OS.
 void FunctionSamples::print(raw_ostream &OS, unsigned Indent) const {
   if (getFunctionHash())
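The printing helper above emits one line per location in the form `<loc>: vtables: <sym>:<count> ...`. Here is a standalone sketch of the same formatting using `std::ostringstream`, with the location passed as pre-formatted text and `std::string` keys. `formatVTableLine` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Formats one vtable line as "<loc>: vtables: <sym>:<count> ...\n".
// Returns an empty string for an empty map, mirroring the early return in
// printTypeCountMap.
std::string formatVTableLine(const std::string &Loc,
                             const std::map<std::string, uint64_t> &Counts) {
  if (Counts.empty())
    return "";
  std::ostringstream OS;
  OS << Loc << ": vtables: ";
  for (const auto &[Type, Count] : Counts)
    OS << Type << ":" << Count << " ";
  OS << "\n";
  return OS.str();
}
```

Note that each entry is followed by a space, so the line ends with a trailing space before the newline. Entries appear in key order because the map is ordered.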
@@ -192,7 +221,13 @@ void FunctionSamples::print(raw_ostream &OS, unsigned Indent) const {
     SampleSorter<LineLocation, SampleRecord> SortedBodySamples(BodySamples);
     for (const auto &SI : SortedBodySamples.get()) {
       OS.indent(Indent + 2);
+      const auto &Loc = SI->first;
       OS << SI->first << ": " << SI->second;
+      if (const TypeCountMap *TypeCountMap =
+              this->findCallsiteTypeSamplesAt(Loc)) {
+        OS.indent(Indent + 2);
+        printTypeCountMap(OS, Loc, *TypeCountMap);
+      }
     }
     OS.indent(Indent);
     OS << "}\n";
@@ -214,6 +249,11 @@ void FunctionSamples::print(raw_ostream &OS, unsigned Indent) const {
         OS << Loc << ": inlined callee: " << FuncSample.getFunction() << ": ";
         FuncSample.print(OS, Indent + 4);
       }
+      auto TypeSamplesIter = VirtualCallsiteTypeCounts.find(Loc);
+      if (TypeSamplesIter != VirtualCallsiteTypeCounts.end()) {
+        OS.indent(Indent + 2);
+        printTypeCountMap(OS, Loc, TypeSamplesIter->second);
+      }
     }
     OS.indent(Indent);
     OS << "}\n";
