Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler #136098

Open
wants to merge 15 commits into
base: main

fanju110

This patch implements IR-based Profile-Guided Optimization support in Flang through the following flags:

  • -fprofile-generate for instrumentation-based profile generation

  • -fprofile-use=<dir>/file for profile-guided optimization

Resolves #74216 (implements IR PGO support phase)

Key changes:

  • Frontend flag handling aligned with Clang/GCC semantics

  • Instrumentation hooks into LLVM PGO infrastructure

  • LIT tests verifying:

    • Instrumentation metadata generation

    • Profile loading from specified path

    • Branch weight attribution (IR checks)
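As a rough illustration of the flag handling listed above, the following self-contained C++ sketch maps command-line arguments to a PGO mode. The names `PgoAction`, `PgoConfig`, and `selectPgoConfig` are hypothetical and do not appear in the patch; the real code goes through LLVM's option parser and `llvm::PGOOptions`. The precedence shown here (instrumentation is checked before profile use) mirrors the if/else-if order in the diff's `runOptimizationPipeline`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the PGO mode chosen by the frontend.
enum class PgoAction { None, IRInstr, IRUse };

struct PgoConfig {
  PgoAction action = PgoAction::None;
  std::string profilePath; // Only meaningful when action == IRUse.
};

// Scans the arguments for the two flags added by the patch. If
// -fprofile-use= is repeated, the last occurrence wins, similar to
// getLastArg() in the diff.
PgoConfig selectPgoConfig(const std::vector<std::string> &args) {
  const std::string usePrefix = "-fprofile-use=";
  bool generate = false;
  bool use = false;
  std::string usePath;

  for (const std::string &arg : args) {
    if (arg == "-fprofile-generate") {
      generate = true;
    } else if (arg.compare(0, usePrefix.size(), usePrefix) == 0) {
      use = true;
      usePath = arg.substr(usePrefix.size());
    }
  }

  PgoConfig config;
  if (generate) {
    // Instrumentation is checked first, as in the diff.
    config.action = PgoAction::IRInstr;
  } else if (use) {
    config.action = PgoAction::IRUse;
    config.profilePath = usePath;
    assert(config.action == PgoAction::IRUse);
  }
  return config;
}
```

Calling `selectPgoConfig({"-O3", "-fprofile-use=a.profdata"})` would yield the `IRUse` action with `profilePath` set to `a.profdata`; an argument list with neither flag leaves the action at `None`.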

Tests:

  • Added gcc-flag-compatibility.f90 test module verifying:

    • Flag parsing boundary conditions

    • IR-level profile annotation consistency

    • Profile input path normalization rules

  • SPEC2006 benchmark results will be shared in comments

For details on LLVM's PGO framework, refer to Clang PGO Documentation.

This implementation was developed by the XSCC Compiler Team.
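The help text for `-fprofile-use=` in the diff states that a directory argument is read as `<pathname>/default.profdata`, while any other argument is read as a file. A minimal sketch of that rule follows, using `std::filesystem`. The helper name `resolveProfileUsePath` is hypothetical, and the sketch does not claim that Flang applies this rule in the same place or in the same way.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper (not part of the patch): resolves the argument of
// -fprofile-use=<pathname> following the rule in the option's help text.
std::string resolveProfileUsePath(const std::string &pathname) {
  assert(!pathname.empty() && "expected a non-empty pathname");

  std::error_code ec;
  const std::filesystem::path path(pathname);

  // is_directory() returns false (without throwing) when the path does not
  // exist, so a missing path is treated as a file name.
  if (std::filesystem::is_directory(path, ec))
    return (path / "default.profdata").string();

  return pathname;
}
```

With this sketch, an existing directory such as the system temporary directory resolves to `default.profdata` inside it, whereas `some/path/file.prof` is returned unchanged.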

Co-Authored-By: ict-ql <[email protected]>

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot added the clang, clang:driver, flang:driver, and flang labels on Apr 17, 2025
@llvmbot (Member) commented Apr 17, 2025

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-flang-driver

Author: FYK (fanju110)

Full diff: https://github.com/llvm/llvm-project/pull/136098.diff

10 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+2-2)
  • (modified) clang/lib/Driver/ToolChains/Flang.cpp (+8)
  • (modified) flang/include/flang/Frontend/CodeGenOptions.def (+5)
  • (modified) flang/include/flang/Frontend/CodeGenOptions.h (+49)
  • (modified) flang/lib/Frontend/CompilerInvocation.cpp (+12)
  • (modified) flang/lib/Frontend/FrontendActions.cpp (+54)
  • (modified) flang/test/Driver/flang-f-opts.f90 (+5)
  • (added) flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext (+19)
  • (added) flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext (+14)
  • (added) flang/test/Profile/gcc-flag-compatibility.f90 (+39)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index affc076a876ad..0b0dbc467c1e0 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1756,7 +1756,7 @@ def fmcdc_max_test_vectors_EQ : Joined<["-"], "fmcdc-max-test-vectors=">,
   HelpText<"Maximum number of test vectors in MC/DC coverage">,
   MarshallingInfoInt<CodeGenOpts<"MCDCMaxTVs">, "0x7FFFFFFE">;
 def fprofile_generate : Flag<["-"], "fprofile-generate">,
-    Group<f_Group>, Visibility<[ClangOption, CLOption]>,
+    Group<f_Group>, Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>,
     HelpText<"Generate instrumented code to collect execution counts into default.profraw (overridden by LLVM_PROFILE_FILE env var)">;
 def fprofile_generate_EQ : Joined<["-"], "fprofile-generate=">,
     Group<f_Group>, Visibility<[ClangOption, CLOption]>,
@@ -1773,7 +1773,7 @@ def fprofile_use : Flag<["-"], "fprofile-use">, Group<f_Group>,
     Visibility<[ClangOption, CLOption]>, Alias<fprofile_instr_use>;
 def fprofile_use_EQ : Joined<["-"], "fprofile-use=">,
     Group<f_Group>,
-    Visibility<[ClangOption, CLOption]>,
+    Visibility<[ClangOption, CLOption, FlangOption, FC1Option]>,
     MetaVarName<"<pathname>">,
     HelpText<"Use instrumentation data for profile-guided optimization. If pathname is a directory, it reads from <pathname>/default.profdata. Otherwise, it reads from file <pathname>.">;
 def fno_profile_instr_generate : Flag<["-"], "fno-profile-instr-generate">,
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index a8b4688aed09c..fcdbe8a6aba5a 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -882,6 +882,14 @@ void Flang::ConstructJob(Compilation &C, const JobAction &JA,
   // TODO: Handle interactions between -w, -pedantic, -Wall, -WOption
   Args.AddLastArg(CmdArgs, options::OPT_w);
 
+
+  if (Args.hasArg(options::OPT_fprofile_generate)){ 
+    CmdArgs.push_back("-fprofile-generate");
+  }
+  if (const Arg *A = Args.getLastArg(options::OPT_fprofile_use_EQ)) {
+    CmdArgs.push_back(Args.MakeArgString(std::string("-fprofile-use=") + A->getValue()));
+  }
+
   // Forward flags for OpenMP. We don't do this if the current action is an
   // device offloading action other than OpenMP.
   if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ,
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index 57830bf51a1b3..4dec86cd8f51b 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -24,6 +24,11 @@ CODEGENOPT(OptimizationLevel, 2, 0) ///< The -O[0-3] option specified.
 CODEGENOPT(DebugPassManager, 1, 0) ///< Prints debug information for the new
                                    ///< pass manager.
 
+ENUM_CODEGENOPT(ProfileInstr, ProfileInstrKind, 2, ProfileNone)
+ENUM_CODEGENOPT(ProfileUse, ProfileInstrKind, 2, ProfileNone)
+/// Whether emit extra debug info for sample pgo profile collection.
+CODEGENOPT(DebugInfoForProfiling, 1, 0)
+CODEGENOPT(AtomicProfileUpdate , 1, 0) ///< Set -fprofile-update=atomic
 CODEGENOPT(IsPIE, 1, 0) ///< PIE level is the same as PIC Level.
 CODEGENOPT(PICLevel, 2, 0) ///< PIC level of the LLVM module.
 CODEGENOPT(PrepareForFullLTO , 1, 0) ///< Set when -flto is enabled on the
diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index 2b4e823b3fef4..e052250f97e75 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// OpenMP is enabled.
   using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind;
 
+  enum ProfileInstrKind {
+    ProfileNone,       // Profile instrumentation is turned off.
+    ProfileClangInstr, // Clang instrumentation to generate execution counts
+                       // to use with PGO.
+    ProfileIRInstr,    // IR level PGO instrumentation in LLVM.
+    ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
+  };
+
+
+  /// Name of the profile file to use as output for -fprofile-instr-generate,
+  /// -fprofile-generate, and -fcs-profile-generate.
+  std::string InstrProfileOutput;
+
+  /// Name of the profile file to use as input for -fmemory-profile-use.
+  std::string MemoryProfileUsePath;
+
+  unsigned int DebugInfoForProfiling;
+
+  unsigned int AtomicProfileUpdate;
+
+  /// Name of the profile file to use as input for -fprofile-instr-use
+  std::string ProfileInstrumentUsePath;
+
+    /// Name of the profile remapping file to apply to the profile data supplied
+  /// by -fprofile-sample-use or -fprofile-instr-use.
+  std::string ProfileRemappingFile;
+
+  /// Check if Clang profile instrumenation is on.
+  bool hasProfileClangInstr() const {
+    return getProfileInstr() == ProfileClangInstr;
+  }
+
+  /// Check if IR level profile instrumentation is on.
+  bool hasProfileIRInstr() const {
+    return getProfileInstr() == ProfileIRInstr;
+  }
+
+  /// Check if CS IR level profile instrumentation is on.
+  bool hasProfileCSIRInstr() const {
+    return getProfileInstr() == ProfileCSIRInstr;
+  }
+    /// Check if IR level profile use is on.
+    bool hasProfileIRUse() const {
+      return getProfileUse() == ProfileIRInstr ||
+             getProfileUse() == ProfileCSIRInstr;
+    }
+  /// Check if CSIR profile use is on.
+  bool hasProfileCSIRUse() const { return getProfileUse() == ProfileCSIRInstr; }
+
   // Define accessors/mutators for code generation options of enumeration type.
 #define CODEGENOPT(Name, Bits, Default)
 #define ENUM_CODEGENOPT(Name, Type, Bits, Default)                             \
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 6f87a18d69c3d..f013fce2f3cfc 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -27,6 +27,7 @@
 #include "clang/Driver/DriverDiagnostic.h"
 #include "clang/Driver/OptionUtils.h"
 #include "clang/Driver/Options.h"
+#include "clang/Driver/Driver.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Frontend/Debug/Options.h"
@@ -431,6 +432,17 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
       opts.IsPIE = 1;
   }
 
+  if (args.hasArg(clang::driver::options::OPT_fprofile_generate)) {
+    opts.setProfileInstr(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.DebugInfoForProfiling = args.hasArg(clang::driver::options::OPT_fdebug_info_for_profiling);
+    opts.AtomicProfileUpdate = args.hasArg(clang::driver::options::OPT_fprofile_update_EQ);
+  }
+ 
+  if (auto A = args.getLastArg(clang::driver::options::OPT_fprofile_use_EQ)) {
+    opts.setProfileUse(Fortran::frontend::CodeGenOptions::ProfileInstrKind::ProfileIRInstr);
+    opts.ProfileInstrumentUsePath = A->getValue();
+  }
+
   // -mcmodel option.
   if (const llvm::opt::Arg *a =
           args.getLastArg(clang::driver::options::OPT_mcmodel_EQ)) {
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index c1f47b12abee2..68880bdeecf8d 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -63,11 +63,14 @@
 #include "llvm/Support/Path.h"
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/ToolOutputFile.h"
+#include "llvm/Support/PGOOptions.h"
 #include "llvm/Target/TargetMachine.h"
 #include "llvm/TargetParser/RISCVISAInfo.h"
 #include "llvm/TargetParser/RISCVTargetParser.h"
 #include "llvm/Transforms/IPO/Internalize.h"
 #include "llvm/Transforms/Utils/ModuleUtils.h"
+#include "llvm/Transforms/Instrumentation/InstrProfiling.h"
+#include "llvm/ProfileData/InstrProfCorrelator.h"
 #include <memory>
 #include <system_error>
 
@@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
 // Custom BeginSourceFileAction
 //===----------------------------------------------------------------------===//
 
+
+static llvm::cl::opt<llvm::PGOOptions::ColdFuncOpt> ClPGOColdFuncAttr(
+  "pgo-cold-func-opt", llvm::cl::init(llvm::PGOOptions::ColdFuncOpt::Default), llvm::cl::Hidden,
+  llvm::cl::desc(
+      "Function attribute to apply to cold functions as determined by PGO"),
+    llvm::cl::values(clEnumValN(llvm::PGOOptions::ColdFuncOpt::Default, "default",
+                        "Default (no attribute)"),
+             clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptSize, "optsize",
+                        "Mark cold functions with optsize."),
+             clEnumValN(llvm::PGOOptions::ColdFuncOpt::MinSize, "minsize",
+                        "Mark cold functions with minsize."),
+             clEnumValN(llvm::PGOOptions::ColdFuncOpt::OptNone, "optnone",
+                        "Mark cold functions with optnone.")));
+
 bool PrescanAction::beginSourceFileAction() { return runPrescan(); }
 
 bool PrescanAndParseAction::beginSourceFileAction() {
@@ -892,6 +909,20 @@ static void generateMachineCodeOrAssemblyImpl(clang::DiagnosticsEngine &diags,
   delete tlii;
 }
 
+
+// Default filename used for profile generation.
+namespace llvm {
+  extern llvm::cl::opt<bool> DebugInfoCorrelate;
+  extern llvm::cl::opt<InstrProfCorrelator::ProfCorrelatorKind> ProfileCorrelate;
+
+
+std::string getDefaultProfileGenName() {
+  return DebugInfoCorrelate || ProfileCorrelate != llvm::InstrProfCorrelator::NONE
+             ? "default_%m.proflite"
+             : "default_%m.profraw";
+}
+}
+
 void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   CompilerInstance &ci = getInstance();
   const CodeGenOptions &opts = ci.getInvocation().getCodeGenOpts();
@@ -909,6 +940,29 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
   llvm::PassInstrumentationCallbacks pic;
   llvm::PipelineTuningOptions pto;
   std::optional<llvm::PGOOptions> pgoOpt;
+ 
+  if (opts.hasProfileIRInstr()){
+    // // -fprofile-generate.
+    pgoOpt = llvm::PGOOptions(
+      opts.InstrProfileOutput.empty() ? llvm::getDefaultProfileGenName()
+                                             : opts.InstrProfileOutput,
+      "", "", opts.MemoryProfileUsePath, nullptr, llvm::PGOOptions::IRInstr,
+      llvm::PGOOptions::NoCSAction, ClPGOColdFuncAttr,
+      opts.DebugInfoForProfiling,
+      /*PseudoProbeForProfiling=*/false, opts.AtomicProfileUpdate);
+    }
+    else if (opts.hasProfileIRUse()) {
+      llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFS = llvm::vfs::getRealFileSystem();
+      // -fprofile-use.
+      auto CSAction = opts.hasProfileCSIRUse() ? llvm::PGOOptions::CSIRUse
+                                                      : llvm::PGOOptions::NoCSAction;
+      pgoOpt = llvm::PGOOptions(opts.ProfileInstrumentUsePath, "",
+        opts.ProfileRemappingFile,
+        opts.MemoryProfileUsePath, VFS,
+                          llvm::PGOOptions::IRUse, CSAction, ClPGOColdFuncAttr,
+                          opts.DebugInfoForProfiling);
+    }
+ 
   llvm::StandardInstrumentations si(llvmModule->getContext(),
                                     opts.DebugPassManager);
   si.registerCallbacks(pic, &mam);
diff --git a/flang/test/Driver/flang-f-opts.f90 b/flang/test/Driver/flang-f-opts.f90
index 4493a519e2010..b972b9b7b2a59 100644
--- a/flang/test/Driver/flang-f-opts.f90
+++ b/flang/test/Driver/flang-f-opts.f90
@@ -8,3 +8,8 @@
 ! CHECK-LABEL: "-fc1"
 ! CHECK: -ffp-contract=off
 ! CHECK: -O3
+
+! RUN: %flang -### -S -fprofile-generate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-GENERATE-LLVM %s
+! CHECK-PROFILE-GENERATE-LLVM: "-fprofile-generate"
+! RUN: %flang -### -S -fprofile-use=%S %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-USE-DIR %s
+! CHECK-PROFILE-USE-DIR: "-fprofile-use={{.*}}"
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
new file mode 100644
index 0000000000000..6a6df8b1d4d5b
--- /dev/null
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR.proftext
@@ -0,0 +1,19 @@
+# IR level Instrumentation Flag
+:ir
+_QQmain
+# Func Hash:
+146835646621254984
+# Num Counters:
+2
+# Counter Values:
+100
+1
+
+main
+# Func Hash:
+742261418966908927
+# Num Counters:
+1
+# Counter Values:
+1
+
diff --git a/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
new file mode 100644
index 0000000000000..9a46140286673
--- /dev/null
+++ b/flang/test/Profile/Inputs/gcc-flag-compatibility_IR_entry.proftext
@@ -0,0 +1,14 @@
+# IR level Instrumentation Flag
+:ir
+:entry_first
+_QQmain
+# Func Hash:
+146835646621254984
+# Num Counters:
+2
+# Counter Values:
+100
+1
+
+
+
diff --git a/flang/test/Profile/gcc-flag-compatibility.f90 b/flang/test/Profile/gcc-flag-compatibility.f90
new file mode 100644
index 0000000000000..0124bc79b87ef
--- /dev/null
+++ b/flang/test/Profile/gcc-flag-compatibility.f90
@@ -0,0 +1,39 @@
+! Tests for -fprofile-generate and -fprofile-use flag compatibility. These two
+! flags behave similarly to their GCC counterparts:
+!
+! -fprofile-generate         Generates the profile file ./default.profraw
+! -fprofile-use=<dir>/file   Uses the profile file <dir>/file
+
+! On AIX, -flto used to be required with -fprofile-generate. gcc-flag-compatibility-aix.c is used to do the testing on AIX with -flto
+! RUN: %flang %s -c -S -o - -emit-llvm -fprofile-generate | FileCheck -check-prefix=PROFILE-GEN %s
+! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section
+! PROFILE-GEN: @__profd_{{_?}}main =
+
+
+
+! Check that -fprofile-use=some/path/file.prof reads some/path/file.prof
+! This uses LLVM IR format profile.
+! RUN: rm -rf %t.dir
+! RUN: mkdir -p %t.dir/some/path
+! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR.proftext -o %t.dir/some/path/file.prof
+! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR1 %s
+!
+
+
+
+! RUN: llvm-profdata merge %S/Inputs/gcc-flag-compatibility_IR_entry.proftext -o %t.dir/some/path/file.prof
+! RUN: %flang %s -o - -emit-llvm -S -fprofile-use=%t.dir/some/path/file.prof | FileCheck -check-prefix=PROFILE-USE-IR2 %s
+! PROFILE-USE-IR1: = !{!"branch_weights", i32 100, i32 1}
+! PROFILE-USE-IR2: = !{!"branch_weights", i32 1, i32 100}
+
+
+program main
+  implicit none
+  integer :: i
+  integer :: X = 0
+
+  do i = 0, 99
+     X = X + i
+  end do
+
+end program main
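The two `.proftext` inputs in the diff use a simple line-oriented layout: `#` comment lines, `:`-prefixed header flags, then per-function records made of a name, a hash, a counter count, and the counter values. The sketch below parses just that subset to make the layout concrete. It is an assumption-laden illustration: the type and function names are hypothetical, it is not LLVM's reader, and it ignores anything the text format may contain beyond what these two files show.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

struct FunctionRecord {
  std::string name;
  std::uint64_t hash = 0;
  std::vector<std::uint64_t> counters;
};

struct TextProfile {
  std::vector<std::string> flags; // e.g. "ir", "entry_first"
  std::vector<FunctionRecord> functions;
};

// Reads the next line that is neither empty nor a '#' comment.
static bool nextDataLine(std::istream &in, std::string &line) {
  while (std::getline(in, line)) {
    if (!line.empty() && line.back() == '\r')
      line.pop_back();
    if (line.empty() || line[0] == '#')
      continue;
    return true;
  }
  return false;
}

// Parses the subset of the text profile layout seen in the test inputs.
// std::stoull throws on malformed numbers; this sketch does not recover.
TextProfile parseTextProfile(const std::string &text) {
  std::istringstream in(text);
  TextProfile profile;
  std::string line;

  while (nextDataLine(in, line)) {
    if (line[0] == ':') { // Header flag such as ":ir".
      profile.flags.push_back(line.substr(1));
      continue;
    }

    FunctionRecord record;
    record.name = line;

    std::string hashLine, countLine;
    if (!nextDataLine(in, hashLine) || !nextDataLine(in, countLine))
      break; // Truncated record: drop it.
    record.hash = std::stoull(hashLine);

    const std::size_t numCounters = std::stoull(countLine);
    for (std::size_t i = 0; i < numCounters; ++i) {
      if (!nextDataLine(in, line))
        break;
      record.counters.push_back(std::stoull(line));
    }
    assert(record.counters.size() <= numCounters);
    profile.functions.push_back(record);
  }
  return profile;
}
```

Fed the contents of `gcc-flag-compatibility_IR.proftext`, this would produce one flag (`ir`) and two records, `_QQmain` with two counters and `main` with one.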

@fanju110
Author

The following are the Fortran benchmark results from SPEC CPU2006:

Benchmark      Runtime without PGO (s)   Runtime with PGO (s)   Speedup
410.bwaves     101                       97.6                   1.03
416.gamess     259                       244                    1.06
434.zeusmp     88.8                      91                     0.98
437.leslie3d   94.7                      94.1                   1.01
454.calculix   182                       180                    1.01
459.GemsFDTD   176                       187                    0.94
465.tonto      118                       124                    0.95
481.wrf        93.4                      91.4                   1.02

  • Compiler: LLVM 20.1.0

  • Options:

    • without PGO:
      • -O3 -flto -ffast-math
    • with PGO:
      • -O3 -flto -ffast-math -fprofile-generate
      • -O3 -flto -ffast-math -fprofile-use=/file
  • Hardware: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz

  • OS: Ubuntu 20.04.6 LTS (Focal Fossa)
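The speedup column above appears to be the runtime without PGO divided by the runtime with PGO, so values below 1 indicate a slowdown. The sketch below shows that arithmetic together with a geometric mean, which is a common way to summarize such ratios. The `Timing` struct and both function names are hypothetical, and the geometric mean is an addition for illustration; the comment itself reports only per-benchmark speedups.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One benchmark measurement, both runtimes in seconds.
struct Timing {
  double withoutPgo;
  double withPgo;
};

// Speedup > 1 means the PGO build ran faster.
double speedup(const Timing &t) {
  assert(t.withPgo > 0.0 && t.withoutPgo > 0.0);
  return t.withoutPgo / t.withPgo;
}

// Geometric mean of the per-benchmark speedups, computed via logarithms.
double geometricMeanSpeedup(const std::vector<Timing> &timings) {
  assert(!timings.empty());
  double logSum = 0.0;
  for (const Timing &t : timings)
    logSum += std::log(speedup(t));
  return std::exp(logSum / static_cast<double>(timings.size()));
}
```

For example, `speedup({176.0, 187.0})` corresponds to the 459.GemsFDTD row, where the PGO build was slower.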

@tblah (Contributor) left a comment

Thanks for contributing this

@tblah tblah requested a review from tarunprabhu April 22, 2025 11:01

github-actions bot commented Apr 22, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@tarunprabhu (Contributor) left a comment

Thanks for working on this.

@@ -148,6 +148,55 @@ class CodeGenOptions : public CodeGenOptionsBase {
/// OpenMP is enabled.
using DoConcurrentMappingKind = flangomp::DoConcurrentMappingKind;

enum ProfileInstrKind {
Contributor

Can this enum be shared between clang and flang? There is precedent for doing this, for example with the VectorLibrary enum. That was moved to llvm/include/llvm/Frontend/Driver/CodeGenOptions.h. We could consider doing the same for this.

Author

This would require the definition of this enum variable to be placed in a public namespace. If I do this, I will need to change the source code of clang. Is this appropriate?
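To make the sharing idea in this thread concrete, here is a minimal sketch in which one enum definition lives in a common namespace and two frontend-specific option structs refer to it. Everything except the enumerator names is hypothetical: `shared_frontend`, `ClangLikeOptions`, and `FlangLikeOptions` are stand-ins, not the layout that was eventually adopted.

```cpp
#include <cassert>

// Stand-in for a header shared by both frontends. The namespace name
// `shared_frontend` is hypothetical.
namespace shared_frontend {

enum ProfileInstrKind {
  ProfileNone,       // Profile instrumentation is turned off.
  ProfileClangInstr, // Clang (frontend-level) instrumentation.
  ProfileIRInstr,    // IR-level PGO instrumentation in LLVM.
  ProfileCSIRInstr,  // Context-sensitive IR-level PGO instrumentation.
};

// Same predicate as hasProfileIRUse() in the diff, written once.
inline bool isIRUse(ProfileInstrKind kind) {
  return kind == ProfileIRInstr || kind == ProfileCSIRInstr;
}

inline const char *toString(ProfileInstrKind kind) {
  switch (kind) {
  case ProfileNone:
    return "none";
  case ProfileClangInstr:
    return "clang-instr";
  case ProfileIRInstr:
    return "ir-instr";
  case ProfileCSIRInstr:
    return "cs-ir-instr";
  }
  assert(false && "unknown ProfileInstrKind");
  return "unknown";
}

} // namespace shared_frontend

// Illustrative stand-ins for each frontend's CodeGenOptions class. Neither
// one re-declares the enum; both use the shared definition.
struct ClangLikeOptions {
  shared_frontend::ProfileInstrKind profileUse = shared_frontend::ProfileNone;
  bool hasProfileIRUse() const { return shared_frontend::isIRUse(profileUse); }
};

struct FlangLikeOptions {
  shared_frontend::ProfileInstrKind profileUse = shared_frontend::ProfileNone;
  bool hasProfileIRUse() const { return shared_frontend::isIRUse(profileUse); }
};
```

The point of the design is that a change to the enumerators happens in one place, so the two frontends cannot drift apart.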

@@ -130,6 +133,20 @@ static bool saveMLIRTempFile(const CompilerInvocation &ci,
// Custom BeginSourceFileAction
//===----------------------------------------------------------------------===//


static llvm::cl::opt<llvm::PGOOptions::ColdFuncOpt> ClPGOColdFuncAttr(
Contributor

This was added in #89298 and is marked with a TODO in clang/lib/CodeGen/BackendUtil.cpp saying that it should be removed once a proper frontend driver option is added. @kiranchandramohan , Do we want to be replicating something experimental like this in flang as well?

Author

Clang currently uses this option as a parameter when initializing PGOOptions. To make it easier to extend Flang's PGO functionality in the future, I'm trying to reproduce as much of Clang's PGO functionality as possible, so I'd like to add this option as well to facilitate this experiment.

Contributor

Adding a proper frontend driver option to both frontends is the best way forward.

Contributor

@aeubanks, do you think it is ok to promote -pgo-cold-func-opt= (introduced in #89298) to a frontend option that can be shared between clang and flang, or is the option still "experimental"?

Author

Adding a proper frontend driver option to both frontends is the best way forward.

Do you mean that flang and clang should reuse this definition? The code is currently defined in clang, so Flang would have to link against Clang's BackendUtil.cpp, which may cause problems. I have therefore put the definition in the llvm namespace, in llvm/Frontend/Driver/CodeGenOptions.h. This way I don't need to copy the code again in flang. What do you think?

Contributor

Sorry, I missed this. Yes, promoting this to a proper frontend option is fine.

extern llvm::cl::opt<InstrProfCorrelator::ProfCorrelatorKind> ProfileCorrelate;


std::string getDefaultProfileGenName() {
Contributor

This looks like a duplicate of the code in clang. I think this function definition could be moved to llvm/lib/Frontend/Driver/CodeGenOptions.cpp or somewhere within llvm/lib/Frontend. There is precedent for doing this with, for example, createTLII. In general, we would like to avoid duplicating code from clang as much as possible.

Author

Thank you for your review.

The getDefaultProfileGenName function in the original clang is defined in an anonymous namespace and cannot be directly reused in flang. If I reuse this function, I may need to change the source code of clang, is this appropriate?

Member

You can change the source of clang to move that function to e.g. llvm/lib/Frontend/Driver/CodeGenOptions.cpp and use it from that location from both clang and flang. We've done that in a few different places for options like these (like createTLII as @tarunprabhu mentioned)

Author

You can change the source of clang to move that function to e.g. llvm/lib/Frontend/Driver/CodeGenOptions.cpp and use it from that location from both clang and flang. We've done that in a few different places for options like these (like createTLII as @tarunprabhu mentioned)

Sorry, I ported this function to the llvm namespace but forgot to remove the code here. I've now removed it.
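The helper discussed in this thread picks a default output name for instrumented runs. A self-contained sketch of that logic, written so that it could sit in one shared place, is shown below. In the diff the two inputs are the `llvm::cl` options `DebugInfoCorrelate` and `ProfileCorrelate`; here they are plain `bool` parameters, and the namespace `shared_frontend` and the function `profileOutputOrDefault` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <string>

namespace shared_frontend {

// Mirrors the choice made by getDefaultProfileGenName() in the diff:
// a ".proflite" name when either correlation mode is active, otherwise
// a ".profraw" name.
std::string getDefaultProfileGenName(bool debugInfoCorrelate,
                                     bool profileCorrelate) {
  return (debugInfoCorrelate || profileCorrelate) ? "default_%m.proflite"
                                                  : "default_%m.profraw";
}

// Mirrors the expression passed to PGOOptions in the diff: an explicit
// output name wins, otherwise the default name is used.
std::string profileOutputOrDefault(const std::string &instrProfileOutput,
                                   bool debugInfoCorrelate,
                                   bool profileCorrelate) {
  return instrProfileOutput.empty()
             ? getDefaultProfileGenName(debugInfoCorrelate, profileCorrelate)
             : instrProfileOutput;
}

} // namespace shared_frontend
```

Keeping a single definition like this callable from both frontends is the kind of reuse the reviewers ask for, in contrast to copying the function into each frontend.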

# Counter Values:
100
1

Contributor

Are the trailing newlines here necessary?

Author

Thanks for the tip, I've removed the extra newlines

Remove redundant comment symbols

Co-authored-by: Tom Eccles <[email protected]>
@tarunprabhu
Contributor

There is precedent for changing the clang source in order to share code or to augment it for use with flang. For what needs to be done here, I think it should be fine. We will have to request reviews from the clang developers as well.

@fanju110
Author

There is precedent for changing the clang source in order to share code or to augment it for use with flang. For what needs to be done here, I think it should be fine. We will have to request reviews from the clang developers as well.

Since this PR focuses on enabling PGO support in Flang, and changes to the Clang source code may require broader discussion and review, I intend to scope this PR to Flang's features. As you suggested, I've implemented it locally, and I'm going to propose another PR to make flang and clang share the code.

@tarunprabhu
Contributor

If I understand what you are saying correctly, this would involve, for example, duplicating the definition of enum ProfileInstrKind in flang in this PR, then a second PR that moves it from both flang and clang to llvm/Frontend. This is churn that can be distracting. The changes to clang in this case are unlikely to be major and getting it reviewed and approved should not be difficult. I would suggest doing those in this PR.

@fanju110
Author

If I understand what you are saying correctly, this would involve, for example, duplicating the definition of enum ProfileInstrKind in flang in this PR, then a second PR that moves it from both flang and clang to llvm/Frontend. This is churn that can be distracting. The changes to clang in this case are unlikely to be major and getting it reviewed and approved should not be difficult. I would suggest doing those in this PR.

Thanks for the advice. I agree, and I'm currently refining the code.

… definition from clang to the llvm namespace to allow flang to reuse this code.
@llvmbot added the clang:frontend and clang:codegen labels on Apr 28, 2025
@tarunprabhu (Contributor) left a comment

Thanks for updating the PR.

There seems to be a buildkite failure. These failures are often worth checking because they might indicate that your changes are incompatible with a build configuration different from yours.

@@ -28,6 +28,7 @@
#include "flang/Semantics/unparse-with-symbols.h"
#include "flang/Support/default-kinds.h"
#include "flang/Tools/CrossToolHelpers.h"
#include "clang/CodeGen/BackendUtil.h"
Contributor

We should not include clang headers unless necessary. If this was to obtain the declaration of ClPGOColdFuncAttr, it may be better to expose it in llvm/Frontend/CodeGenOptions.h and include that header here instead

@tblah (Contributor) left a comment

Thanks for the updates

Comment on lines 11 to 15
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/PGOOptions.h"

#include "clang/Basic/LLVM.h"
#include "llvm/IR/ModuleSummaryIndex.h"
Contributor

Comment on lines 27 to 29
namespace llvm {
extern cl::opt<llvm::PGOOptions::ColdFuncOpt> ClPGOColdFuncAttr;
} // namespace llvm
Contributor

nit: I would put this extern in flang/lib/Frontend/FrontendActions.cpp because there is no need for this implementation detail to be part of the public interface.

@@ -19,6 +21,7 @@ template <typename T> class Expected;
template <typename T> class IntrusiveRefCntPtr;
class Module;
class MemoryBufferRef;
extern cl::opt<llvm::PGOOptions::ColdFuncOpt> ClPGOColdFuncAttr;
Contributor

nit: I would put this extern in BackendUtil.cpp because it is not part of BackendUtils's public interface to other translation units.

Author

Thanks. As you suggested, I put the extern declaration in the .cpp file.

…gShan/llvm-project into OpenXiangShan/flang-new-add-PGO

# Conflicts:
#	flang/lib/Frontend/FrontendActions.cpp   resolved by OpenXiangShan/flang-new-add-PGO version
#	llvm/include/llvm/Frontend/Driver/CodeGenOptions.h   resolved by OpenXiangShan/flang-new-add-PGO version
#	llvm/lib/Frontend/Driver/CodeGenOptions.cpp   resolved by OpenXiangShan/flang-new-add-PGO version
@tarunprabhu (Contributor) left a comment

Thanks for making all the changes. Just a few minor comments.

@@ -13,6 +13,7 @@
#ifndef LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H
#define LLVM_FRONTEND_DRIVER_CODEGENOPTIONS_H

#include <string>
Contributor

An empty line should separate the last include and the namespace below

Author

OK, I have adjusted it as you suggested.

! PROFILE-GEN: @__profc_{{_?}}main = {{(private|internal)}} global [1 x i64] zeroinitializer, section
! PROFILE-GEN: @__profd_{{_?}}main =


Contributor

Is there a reason for leaving 3 empty lines here?

Author

There's no particular reason; I based this file on Clang's version. I've adjusted the layout to keep it tidy.

@tarunprabhu (Contributor) left a comment

Thank you for seeing this through and making all the little changes. I have requested reviews from @MaskRay and @aeubanks for the clang side of things.

// to use with PGO.
ProfileIRInstr, // IR level PGO instrumentation in LLVM.
ProfileCSIRInstr, // IR level PGO context sensitive instrumentation in LLVM.
};
TargetLibraryInfoImpl *createTLII(const llvm::Triple &TargetTriple,
Contributor

Nit: Newline here too please.

Labels: clang:codegen, clang:driver, clang:frontend, clang, flang:driver, flang
Development

Successfully merging this pull request may close these issues.

Add Profile-Guided Optimization (PGO) support to the Flang compiler
7 participants