Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler #136098
This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags:
-fprofile-generate for instrumentation-based profile generation
-fprofile-use=<dir>/file for profile-guided optimization

Co-Authored-By: ict-ql <[email protected]>
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-flang-driver

Author: FYK (fanju110)

Changes

This patch implements IR-based Profile-Guided Optimization support in Flang through the -fprofile-generate and -fprofile-use=<dir>/file flags.

Resolves #74216 (implements IR PGO support phase)

For details on LLVM's PGO framework, refer to Clang PGO Documentation. This implementation was developed by XSCC Compiler Team.

Full diff: https://github.com/llvm/llvm-project/pull/136098.diff

10 Files Affected:
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index affc076a876ad..0b0dbc467c1e0 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1756,7 +1756,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">,
HelpText<"Maximum number of test vectors in MC/DC coverage">,
MarshallingInfoInt<CodeGenOpts<"MCDCMaxTVs">, "0x7FFFFFFE">;
def fprofile_generate : Flag<["-"], "fprofile-generate">,
- Group<f_Group>, Visibility<[ClangOption, CLOption]>,
+ Group<f_Group>, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>,
HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">;
def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">,
Group<f_Group>, Visibility<[ClangOption, CLOption]>,
@@ -1773,7 +1773,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group<f_Group>,
Visibility<[ClangOption, CLOption]>, Alias<fprofile_instr_use>;
def fprofile_use_EQ : Joined<["-"], "fprofile-use=">,
Group<f_Group>,
- Visibility<[ClangOption, CLOption]>,
+ Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>,
MetaVarName<"<pathname>">,
HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from <pathname>/default.profdata. Otherwise, it reads from file <pathname>.">;
def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">,
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index a8b4688aed09c..fcdbe8a6aba5a 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -882,6 +882,14 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA,
// TODO: Handle interactions between -w, -pedantic, -Wall, -WOption
Args.AddLastArg(CmdArgs, options::OPT_w);
+
+ if (Args.hasArg(options::OPT_fprofile_generate)){
+ CmdArgs.push_back("-fprofile-generate");
+ }
+ if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) {
+ CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue()));
+ }
+
// Forward flags for OpenMP. We don't do this if the current action is an
// device offloading action other than OpenMP.
if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ,
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index 57830bf51a1b3..4dec86cd8f51b 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -24,6 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified.
CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
///< pass manager.
+ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+/// Whether emit extra debug info for sample pgo profile collection.
+CODEGENOPT(DebugInfoForProfiling, 1, 0)
+CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level.
CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module.
CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the
diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index 2b4e823b3fef4..e052250f97e75 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase {
/// OpenMP is enabled.
using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind;
+ enum ProfileInstrKind {
+ ProfileNone, // Profile instrumentation is turned off.
+ ProfileClangInstr, // Clang instrumentation to generate execution counts
+ // to use with PGO.
+ ProfileIRInstr, // IR level PGO instrumentation in LLVM.
+ ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
+ };
+
+
+ /// Name of the profile file to use as output for -fprofile-instr-generate,
+ /// -fprofile-generate, and -fcs-profile-generate.
+ std::string InstrProfileOutput;
+
+ /// Name of the profile file to use as input for -fmemory-profile-use.
+ std::string MemoryProfileUsePath;
+
+ unsigned int DebugInfoForProfiling;
+
+ unsigned int AtomicProfileUpdate;
+
+ /// Name of the profile file to use as input for -fprofile-instr-use
+ std::string ProfileInstrumentUsePath;
+
+ /// Name of the profile remapping file to apply to the profile data supplied
+ /// by -fprofile-sample-use or -fprofile-instr-use.
+ std::string ProfileRemappingFile;
+
+ /// Check if Clang profile instrumentation is on.
+ bool hasProfileClangInstr() const {
+ return getProfileInstr() == ProfileClangInstr;
+ }
+
+ /// Check if IR level profile instrumentation is on.
+ bool hasProfileIRInstr() const {
+ return getProfileInstr() == ProfileIRInstr;
+ }
+
+ /// Check if CS IR level profile instrumentation is on.
+ bool hasProfileCSIRInstr() const {
+ return getProfileInstr() == ProfileCSIRInstr;
+ }
+ /// Check if IR level profile use is on.
+ bool hasProfileIRUse() const {
+ return getProfileUse() == ProfileIRInstr ||
+ getProfileUse() == ProfileCSIRInstr;
+ }
+ /// Check if CSIR profile use is on.
+ bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+
// Define accessors/mutators for code generation options of enumeration type.
#define CODEGENOPT(Name, Bits, Default)
#define ENUM_CODEGENOPT(Name, Type, Bits, Default) \
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 6f87a18d69c3d..f013fce2f3cfc 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -27,6 +27,7 @@
#include "clang/Driver/DriverDiagnostic.h"
#include "clang/Driver/OptionUtils.h"
#include "clang/Driver/Options.h"
+#include "clang/Driver/Driver.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/StringSwitch.h"
#include "llvm/Frontend/Debug/Options.h"
@@ -431,6 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
opts.IsPIE = 1;
}
+ if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) {
+ opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+ opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling);
+ opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ);
+ }
+
+ if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) {
+ opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+ opts.ProfileInstrumentUsePath = A->getValue();
+ }
+
// -mcmodel option.
if (const llvm::opt::Arg *a =
args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) {
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c1f47b12abee2..68880bdeecf8d 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -63,11 +63,14 @@
#include "llvm/Support/Path.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/ToolOutputFile.h"
+#include "llvm/Support/PGOOptions.h"
#include "llvm/Target/TargetMachine.h"
#include "llvm/TargetParser/RISCVISAInfo.h"
#include "llvm/TargetParser/RISCVTargetParser.h"
#include "llvm/Transforms/IPO/Internalize.h"
#include "llvm/Transforms/Utils/ModuleUtils.h"
+#include "llvm/Transforms/Instrumentation/InstrProfiling.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
#include <memory>
#include <system_error>
@@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
// Custom BeginSourceFileAction
//===----------------------------------------------------------------------===//
+
+static llvm::cl::opt<llvm::PGOOptions::ColdFuncOpt> ClPGOColdFuncAttr(
+ "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden,
+ llvm::cl::desc(
+ "Function attribute to apply to cold functions as determined by PGO"),
+ llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default",
+ "Default (no attribute)"),
+ clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize",
+ "Mark cold functions with optsize."),
+ clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize",
+ "Mark cold functions with minsize."),
+ clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone",
+ "Mark cold functions with optnone.")));
+
bool PrescanAction::beginSourceFileAction() { return runPrescan(); }
bool PrescanAndParseAction::beginSourceFileAction() {
@@ -892,6 +909,20 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
delete tlii;
}
+
+// Default filename used for profile generation.
+namespace llvm {
+ extern llvm::cl::opt<bool> DebugInfoCorrelate;
+ extern llvm::cl::opt<InstrProfCorrelator::ProfCorrelatorKind> ProfileCorrelate;
+
+
+std::string getDefaultProfileGenName() {
+ return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE
+ ? "default_%m.proflite"
+ : "default_%m.profraw";
+}
+}
+
void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
CompilerInstance &ci = getInstance();
const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts();
@@ -909,6 +940,29 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
llvm::PassInstrumentationCallbacks pic;
llvm::PipelineTuningOptions pto;
std::optional<llvm::PGOOptions> pgoOpt;
+
+ if (opts.hasProfileIRInstr()){
+ // -fprofile-generate.
+ pgoOpt = llvm::PGOOptions(
+ opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
+ : opts.InstrProfileOutput,
+ "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
+ llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
+ opts.DebugInfoForProfiling,
+ /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
+ }
+ else if (opts.hasProfileIRUse()) {
+ llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFS = llvm::vfs::getRealFileSystem();
+ // -fprofile-use.
+ auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse
+ : llvm::PGOOptions::NoCSAction;
+ pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "",
+ opts.ProfileRemappingFile,
+ opts.MemoryProfileUsePath, VFS,
+ llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr,
+ opts.DebugInfoForProfiling);
+ }
+
llvm::StandardInstrumentations si(llvmModule->getContext(),
opts.DebugPassManager);
si.registerCallbacks(pic, &mam);
diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90
index 4493a519e2010..b972b9b7b2a59 100644
--- a/flang/test/Driver/flang-f-opts.f90
+++ b/flang/test/Driver/flang-f-opts.f90
@@ -8,3 +8,8 @@
! CHECK-LABEL: "-fc1"
! CHECK: -ffp-contract=off
! CHECK: -O3
+
+! RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s
+! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate"
+! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s
+! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}"
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
new file mode 100644
index 0000000000000..6a6df8b1d4d5b
--- /dev/null
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
@@ -0,0 +1,19 @@
+# IR level Instrumentation Flag
+:ir
+_QQmain
+# Func Hash:
+146835646621254984
+# Num Counters:
+2
+# Counter Values:
+100
+1
+
+main
+# Func Hash:
+742261418966908927
+# Num Counters:
+1
+# Counter Values:
+1
+
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
new file mode 100644
index 0000000000000..9a46140286673
--- /dev/null
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
@@ -0,0 +1,14 @@
+# IR level Instrumentation Flag
+:ir
+:entry_first
+_QQmain
+# Func Hash:
+146835646621254984
+# Num Counters:
+2
+# Counter Values:
+100
+1
+
+
+
diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90
new file mode 100644
index 0000000000000..0124bc79b87ef
--- /dev/null
+++ b/flang/test/Profile/gcc-flag-compatibility.f90
@@ -0,0 +1,39 @@
+! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two
+! flags behave similarly to their GCC counterparts:
+!
+! -fprofile-generate Generates the profile file ./default.profraw
+! -fprofile-use=<dir>/file Uses the profile file <dir>/file
+
+! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto
+! RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s
+! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section
+! PROFILE-GEN: @__profd_{{_?}}main =
+
+
+
+! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof
+! This uses LLVM IR format profile.
+! RUN: rm -rf %t.dir
+! RUN: mkdir -p %t.dir/some/path
+! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof
+! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s
+!
+
+
+
+! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof
+! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s
+! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1}
+! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100}
+
+
+program main
+ implicit none
+ integer :: i
+ integer :: X = 0
+
+ do i = 0, 99
+ X = X + i
+ end do
+
+end program main
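The option-selection logic in the FrontendActions.cpp hunk may be easier to follow in isolation. The sketch below is a simplified, self-contained model of it and not the actual Flang code: CodeGenOpts, PGOChoice, selectPGO, and defaultProfileGenName are hypothetical stand-ins for the real CodeGenOptions and llvm::PGOOptions types, and the Correlate parameter stands in for the DebugInfoCorrelate / ProfileCorrelate checks. It illustrates two points from the diff: instrumentation is checked before profile use, and an empty output name falls back to the default file name.

```cpp
#include <optional>
#include <string>

// Simplified stand-ins for the option types in the diff. These are
// illustrative only; they are not the real Flang or LLVM classes.
enum class ProfileInstrKind { None, ClangInstr, IRInstr, CSIRInstr };

struct CodeGenOpts {
  ProfileInstrKind ProfileInstr = ProfileInstrKind::None;
  ProfileInstrKind ProfileUse = ProfileInstrKind::None;
  std::string InstrProfileOutput;       // empty means "use the default name"
  std::string ProfileInstrumentUsePath; // value of -fprofile-use=<path>
};

enum class PGOAction { IRInstr, IRUse };

// Hypothetical result type standing in for llvm::PGOOptions.
struct PGOChoice {
  PGOAction Action;
  std::string ProfileFile;
  bool ContextSensitiveUse;
};

// Mirrors the two default names returned by getDefaultProfileGenName()
// in the diff; Correlate models the correlation checks.
std::string defaultProfileGenName(bool Correlate) {
  return Correlate ? "default_%m.proflite" : "default_%m.profraw";
}

// Instrumentation is tested first, so it wins if both are requested.
std::optional<PGOChoice> selectPGO(const CodeGenOpts &Opts, bool Correlate) {
  if (Opts.ProfileInstr == ProfileInstrKind::IRInstr) {
    std::string Out = Opts.InstrProfileOutput.empty()
                          ? defaultProfileGenName(Correlate)
                          : Opts.InstrProfileOutput;
    return PGOChoice{PGOAction::IRInstr, Out, false};
  }
  // Context-sensitive use is a special case of IR-level use.
  bool CSUse = Opts.ProfileUse == ProfileInstrKind::CSIRInstr;
  if (Opts.ProfileUse == ProfileInstrKind::IRInstr || CSUse)
    return PGOChoice{PGOAction::IRUse, Opts.ProfileInstrumentUsePath, CSUse};
  return std::nullopt; // no PGO requested
}
```

Under these assumptions, calling selectPGO with only ProfileInstr set to IRInstr yields an instrumentation choice that writes to default_%m.profraw, while an options object with neither field set yields std::nullopt.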
The following is the Fortran benchmark test data from SPEC CPU2006
Thanks for contributing this
✅ With the latest revision this PR passed the C/C++ code formatter.
Thanks for working on this.
@@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase {
/// OpenMP is enabled.
using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind;

enum ProfileInstrKind {

Can this enum be shared between clang and flang? There is precedent for doing this, for example with the VectorLibrary enum. That was moved to llvm/include/llvm/Frontend/Driver/CodeGenOptions.h. We could consider doing the same for this.
This would require the definition of this enum variable to be placed in a public namespace. If I do this, I will need to change the source code of clang. Is this appropriate?
@@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
// Custom BeginSourceFileAction
//===----------------------------------------------------------------------===//

static llvm::cl::opt<llvm::PGOOptions::ColdFuncOpt> ClPGOColdFuncAttr(

This was added in #89298 and is marked with a TODO in clang/lib/CodeGen/BackendUtil.cpp saying that it should be removed once a proper frontend driver option is added. @kiranchandramohan, do we want to be replicating something experimental like this in flang as well?
Clang currently uses this option as a parameter when initializing PGOOptions. To make it easier to implement Flang's PGO functionality in the future, I'm trying to reproduce as much of Clang's PGO functionality as possible, so I'd like to add this option as well to facilitate that experiment.
Adding a proper frontend driver option to both frontends is the best way forward.
> Adding a proper frontend driver option to both frontends is the best way forward.

Do you mean that flang and clang should reuse this definition? The code is currently defined in clang, so Flang would need to link against Clang's BackendUtil.cpp, which may cause problems. Instead, I put the definition in the llvm namespace, in llvm/Frontend/Driver/CodeGenOptions.h. This way I don't need to copy the code again in flang. What do you think about this?
Sorry, I missed this. Yes, promoting this to a proper frontend option is fine.
extern llvm::cl::opt<InstrProfCorrelator::ProfCorrelatorKind> ProfileCorrelate;

std::string getDefaultProfileGenName() {

This looks like a duplicate of code in clang. I think this function definition could be moved to llvm/lib/Frontend/Driver/CodeGenOptions.cpp or somewhere within llvm/lib/Frontend. There is precedent for doing this with, for example, createTLII. In general, we would like to avoid duplicating code from clang as much as possible.
Thank you for your review.
The getDefaultProfileGenName function in the original clang is defined in an anonymous namespace and cannot be directly reused in flang. To reuse this function, I may need to change the source code of clang. Is this appropriate?
You can change the source of clang to move that function to e.g. llvm/lib/Frontend/Driver/CodeGenOptions.cpp and use it from that location from both clang and flang. We've done that in a few different places for options like these (like createTLII as @tarunprabhu mentioned)
> You can change the source of clang to move that function to e.g. llvm/lib/Frontend/Driver/CodeGenOptions.cpp and use it from that location from both clang and flang. We've done that in a few different places for options like these (like createTLII as @tarunprabhu mentioned)

Sorry, I had already ported this function to the llvm namespace but forgot to remove the code here. I've now removed it.
# Counter Values:
100
1

Are the trailing newlines here necessary?
Thanks for the tip, I've removed the extra newlines
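For readers unfamiliar with the .proftext inputs discussed here, the following is a minimal sketch of a reader for the record layout used in the test files above (optional `:flag` header lines, then a function name, hash, counter count, and counter values per record). It is an illustration only, not the parser that llvm-profdata uses. The names FunctionRecord, TextProfile, nextLine, and parseTextProfile are hypothetical, and the sketch assumes that blank lines and lines starting with `#` carry no meaning, which would be why trailing newlines are unnecessary.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// One function entry from a text profile (hypothetical type).
struct FunctionRecord {
  std::string Name;
  std::uint64_t Hash = 0;
  std::vector<std::uint64_t> Counters;
};

struct TextProfile {
  std::vector<std::string> Flags; // e.g. "ir", "entry_first"
  std::vector<FunctionRecord> Functions;
};

// Fetches the next meaningful line, skipping blank lines and '#' comments
// (an assumption of this sketch).
static bool nextLine(std::istream &In, std::string &Line) {
  while (std::getline(In, Line)) {
    if (!Line.empty() && Line.back() == '\r')
      Line.pop_back();
    if (Line.empty() || Line[0] == '#')
      continue;
    return true;
  }
  return false;
}

// Returns false if a record is cut short. Malformed numbers make
// std::stoull throw; a real reader would report an error instead.
bool parseTextProfile(const std::string &Text, TextProfile &Out) {
  std::istringstream In(Text);
  std::string Line;
  while (nextLine(In, Line)) {
    if (Line[0] == ':') { // header flag such as ":ir"
      Out.Flags.push_back(Line.substr(1));
      continue;
    }
    FunctionRecord Rec;
    Rec.Name = Line;
    if (!nextLine(In, Line))
      return false;
    Rec.Hash = std::stoull(Line);
    if (!nextLine(In, Line))
      return false;
    std::size_t NumCounters = std::stoull(Line);
    for (std::size_t I = 0; I < NumCounters; ++I) {
      if (!nextLine(In, Line))
        return false;
      Rec.Counters.push_back(std::stoull(Line));
    }
    Out.Functions.push_back(Rec);
  }
  return true;
}
```

Fed the contents of gcc-flag-compatibility_IR_entry.proftext, this sketch would produce the flags `ir` and `entry_first` and a single record for `_QQmain` with two counters, with or without the trailing blank lines.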
Remove redundant comment symbols

Co-authored-by: Tom Eccles <[email protected]>
There is precedent for changing the clang source in order to share code or to augment it for use with flang. For what needs to be done here, I think it should be fine. We will have to request reviews from the clang developers as well.
Since this PR focuses on enabling PGO support in Flang, and changes to the Clang source code may require broader discussion and review, I intend to scope this PR to Flang's features. As you suggested, I've implemented it locally, and I'm going to propose another PR to make flang and clang share the code.
If I understand what you are saying correctly, this would involve, for example, duplicating the definition of
Thanks for the advice. I agree with you, and I'm currently refining the code.
… definition from clang to the llvm namespace to allow flang to reuse this code.
Thanks for updating the PR.
There seems to be a buildkite failure. These failures are often worth checking because they might indicate that your changes are incompatible with a build configuration different from yours.
@@ -28,6 +28,7 @@
#include "flang/Semantics/unparse-with-symbols.h"
#include "flang/Support/default-kinds.h"
#include "flang/Tools/CrossToolHelpers.h"
#include "clang/CodeGen/BackendUtil.h"

We should not include clang headers unless necessary. If this was to obtain the declaration of ClPGOColdFuncAttr, it may be better to expose it in llvm/Frontend/CodeGenOptions.h and include that header here instead
Thanks for the updates
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/PGOOptions.h"

#include "clang/Basic/LLVM.h"
#include "llvm/IR/ModuleSummaryIndex.h"

nit: include ordering
https://llvm.org/docs/CodingStandards.html#include-style
namespace llvm {
extern cl::opt<llvm::PGOOptions::ColdFuncOpt> ClPGOColdFuncAttr;
} // namespace llvm

nit: I would put this extern in flang/lib/Frontend/FrontendActions.cpp because there is no need for this implementation detail to be part of the public interface.
@@ -19,6 +21,7 @@ template <typename T> class Expected;
template <typename T> class IntrusiveRefCntPtr;
class Module;
class MemoryBufferRef;
extern cl::opt<llvm::PGOOptions::ColdFuncOpt> ClPGOColdFuncAttr;

nit: I would put this extern in BackendUtil.cpp because it is not part of BackendUtil's public interface to other translation units.
Thanks. As you suggested, I moved it into the .cpp file where it is used.
…gShan/llvm-project into OpenXiangShan/flang-new-add-PGO
# Conflicts:
# flang/lib/Frontend/FrontendActions.cpp resolved by OpenXiangShan/flang-new-add-PGO version
# llvm/include/llvm/Frontend/Driver/CodeGenOptions.h resolved by OpenXiangShan/flang-new-add-PGO version
# llvm/lib/Frontend/Driver/CodeGenOptions.cpp resolved by OpenXiangShan/flang-new-add-PGO version
Thanks for making all the changes. Just a few minor comments.
@@ -13,6 +13,7 @@
#ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
#define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H

#include <string>

An empty line should separate the last include and the namespace below
OK, I have adjusted it as you suggested.
! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section
! PROFILE-GEN: @__profd_{{_?}}main =

Is there a reason for leaving 3 empty lines here?
There was no particular reason. I based this file on the clang test, and I have now adjusted the layout to keep it tidy.
// to use with PGO.
ProfileIRInstr, // IR level PGO instrumentation in LLVM.
ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
};
TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,

Nit: Newline here too please.
This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags:
- -fprofile-generate for instrumentation-based profile generation
- -fprofile-use=<dir>/file for profile-guided optimization

Resolves #74216 (implements IR PGO support phase)

Key changes:
- Frontend flag handling aligned with Clang/GCC semantics
- Instrumentation hooks into LLVM PGO infrastructure
- LIT tests verifying:
  - Instrumentation metadata generation
  - Profile loading from specified path
  - Branch weight attribution (IR checks)

Tests:
- Added gcc-flag-compatibility.f90 test module verifying:
  - Flag parsing boundary conditions
  - IR-level profile annotation consistency
  - Profile input path normalization rules
- SPEC2006 benchmark results will be shared in comments

For details on LLVM's PGO framework, refer to Clang PGO Documentation.
This implementation was developed by XSCC Compiler Team.
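
The help text for -fprofile-use= in Options.td says that a directory argument is read as <pathname>/default.profdata and anything else is read as the file itself. The sketch below models only that documented rule; it is not code from this patch (the Flang driver change shown in the diff forwards the value unchanged), and resolveProfileUsePath is a hypothetical helper name.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper modelling the -fprofile-use=<pathname> help text:
// a directory maps to <pathname>/default.profdata, anything else is
// treated as the profile file itself.
std::string resolveProfileUsePath(const std::string &Pathname) {
  namespace fs = std::filesystem;
  std::error_code EC; // non-throwing overload; errors count as "not a directory"
  if (fs::is_directory(Pathname, EC))
    return (fs::path(Pathname) / "default.profdata").string();
  return Pathname;
}
```

Under this model, a path that does not exist or that names a regular file, such as some/path/file.prof in the test above, is returned unchanged.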