[ScheduleDAG] Allow disabling the SchedModel / Itineraries during Scheduling #138057
Change-Id: I34b84c83b5de73a93911641a26a4260f156128d6
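@llvm/pr-subscribers-backend-amdgpu
Author: Jeffrey Byrnes (jrbyrnes)
Changes
This provides the `disable-schedmodel-in-sched-mi` flag. Using this, we will disable the SchedModel / Itineraries during scheduling. This has the effect of not using any latency / hardware resource information for scheduling decisions.
We have the `schedmodel` flag, but this disables the `SchedModel` for all passes. This allows disabling only for scheduling while preserving the behavior of other passes (e.g. MachineLICM). This is conceptually similar to other flags like `enable-aa-sched-mi`.
Full diff: https://github.com/llvm/llvm-project/pull/138057.diff
5 Files Affected: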
diff --git a/llvm/include/llvm/CodeGen/TargetSchedule.h b/llvm/include/llvm/CodeGen/TargetSchedule.h
index bfe4234abf8eb..0314940cbafd5 100644
--- a/llvm/include/llvm/CodeGen/TargetSchedule.h
+++ b/llvm/include/llvm/CodeGen/TargetSchedule.h
@@ -45,6 +45,8 @@ class TargetSchedModel {
unsigned computeInstrLatency(const MCSchedClassDesc &SCDesc) const;
+ bool DisableItinerariesAndSchedModel = false;
+
public:
TargetSchedModel() : SchedModel(MCSchedModel::Default) {}
@@ -53,7 +55,8 @@ class TargetSchedModel {
/// The machine model API keeps a copy of the top-level MCSchedModel table
/// indices and may query TargetSubtargetInfo and TargetInstrInfo to resolve
/// dynamic properties.
- void init(const TargetSubtargetInfo *TSInfo);
+ void init(const TargetSubtargetInfo *TSInfo,
+ bool DisableItinerariesAndSchedModel = false);
/// Return the MCSchedClassDesc for this instruction.
const MCSchedClassDesc *resolveSchedClass(const MachineInstr *MI) const;
diff --git a/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp b/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
index a26804707dd1f..c6d3a0be1dfa5 100644
--- a/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
+++ b/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
@@ -69,6 +69,10 @@ static cl::opt<bool>
static cl::opt<bool> UseTBAA("use-tbaa-in-sched-mi", cl::Hidden,
cl::init(true), cl::desc("Enable use of TBAA during MI DAG construction"));
+static cl::opt<bool> DisableSchedModel(
+ "disable-schedmodel-in-sched-mi", cl::Hidden, cl::init(false),
+ cl::desc("Disable use of the SchedModel / itineraries during MI scheduling"));
+
// Note: the two options below might be used in tuning compile time vs
// output quality. Setting HugeRegion so large that it will never be
// reached means best-effort, but may be slow.
@@ -121,7 +125,7 @@ ScheduleDAGInstrs::ScheduleDAGInstrs(MachineFunction &mf,
DbgValues.clear();
const TargetSubtargetInfo &ST = mf.getSubtarget();
- SchedModel.init(&ST);
+ SchedModel.init(&ST, DisableSchedModel);
}
/// If this machine instr has memory reference information and it can be
diff --git a/llvm/lib/CodeGen/TargetSchedule.cpp b/llvm/lib/CodeGen/TargetSchedule.cpp
index db884b4940395..98cbeed9f03a3 100644
--- a/llvm/lib/CodeGen/TargetSchedule.cpp
+++ b/llvm/lib/CodeGen/TargetSchedule.cpp
@@ -40,19 +40,23 @@ static cl::opt<bool> ForceEnableIntervals(
cl::desc("Force the use of resource intervals in the schedule model"));
bool TargetSchedModel::hasInstrSchedModel() const {
- return EnableSchedModel && SchedModel.hasInstrSchedModel();
+ return EnableSchedModel && SchedModel.hasInstrSchedModel() &&
+ !DisableItinerariesAndSchedModel;
}
bool TargetSchedModel::hasInstrItineraries() const {
- return EnableSchedItins && !InstrItins.isEmpty();
+ return EnableSchedItins && !InstrItins.isEmpty() &&
+ !DisableItinerariesAndSchedModel;
}
-void TargetSchedModel::init(const TargetSubtargetInfo *TSInfo) {
+void TargetSchedModel::init(const TargetSubtargetInfo *TSInfo, bool Disable) {
STI = TSInfo;
SchedModel = TSInfo->getSchedModel();
TII = TSInfo->getInstrInfo();
STI->initInstrItins(InstrItins);
+ DisableItinerariesAndSchedModel = Disable;
+
unsigned NumRes = SchedModel.getNumProcResourceKinds();
ResourceFactors.resize(NumRes);
ResourceLCM = SchedModel.IssueWidth;
diff --git a/llvm/test/CodeGen/AMDGPU/mai-hazards-gfx942.mir b/llvm/test/CodeGen/AMDGPU/mai-hazards-gfx942.mir
index d029043f90a85..dc57d421ee03f 100644
--- a/llvm/test/CodeGen/AMDGPU/mai-hazards-gfx942.mir
+++ b/llvm/test/CodeGen/AMDGPU/mai-hazards-gfx942.mir
@@ -1,5 +1,6 @@
# RUN: llc -mtriple=amdgcn -mcpu=gfx942 -verify-machineinstrs -run-pass post-RA-hazard-rec %s -o - | FileCheck -check-prefixes=GCN,GFX942 %s
# RUN: llc -mtriple=amdgcn -mcpu=gfx950 -verify-machineinstrs -run-pass post-RA-hazard-rec %s -o - | FileCheck -check-prefixes=GCN,GFX950 %s
+# RUN: llc -mtriple=amdgcn -mcpu=gfx950 -verify-machineinstrs -run-pass post-RA-hazard-rec --disable-schedmodel-in-sched-mi=1 %s -o - | FileCheck -check-prefixes=GCN,GFX950 %s
# GCN-LABEL: name: valu_write_vgpr_sgemm_mfma_read
# GCN: V_MOV_B32
diff --git a/llvm/test/CodeGen/AMDGPU/sched-no-schedmodel.mir b/llvm/test/CodeGen/AMDGPU/sched-no-schedmodel.mir
new file mode 100644
index 0000000000000..685b20ddd1156
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/sched-no-schedmodel.mir
@@ -0,0 +1,50 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -misched-cluster=false --misched-prera-direction=topdown -run-pass=machine-scheduler --disable-schedmodel-in-sched-mi=0 -o - %s | FileCheck -check-prefix=GCN %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -misched-cluster=false --misched-prera-direction=topdown -run-pass=machine-scheduler --disable-schedmodel-in-sched-mi=1 -o - %s | FileCheck -check-prefix=GCN-NO-SCHEDMODEL %s
+
+---
+name: sched_group_barrier_1_VMEM_READ_1_VALU_5_MFMA_1_VMEM_READ_3_VALU_2_VMEM_WRITE
+tracksRegLiveness: true
+body: |
+ bb.0:
+
+ ; GCN-LABEL: name: sched_group_barrier_1_VMEM_READ_1_VALU_5_MFMA_1_VMEM_READ_3_VALU_2_VMEM_WRITE
+ ; GCN: [[DEF:%[0-9]+]]:vreg_128_align2 = IMPLICIT_DEF
+ ; GCN-NEXT: [[DEF1:%[0-9]+]]:vreg_128_align2 = IMPLICIT_DEF
+ ; GCN-NEXT: early-clobber %2:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 [[DEF]].sub0_sub1, [[DEF1]].sub0_sub1, 0, 0, 0, 0, implicit $mode, implicit $exec
+ ; GCN-NEXT: [[DEF2:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+ ; GCN-NEXT: dead [[DS_READ_U16_gfx9_:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF2]], 0, 0, implicit $exec
+ ; GCN-NEXT: [[DEF3:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+ ; GCN-NEXT: dead [[DS_READ_U16_gfx9_1:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF3]], 0, 0, implicit $exec
+ ; GCN-NEXT: [[DEF4:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+ ; GCN-NEXT: dead [[DS_READ_U16_gfx9_2:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF4]], 0, 0, implicit $exec
+ ; GCN-NEXT: [[V_MUL_LO_U32_e64_:%[0-9]+]]:vgpr_32 = nsw V_MUL_LO_U32_e64 %2.sub0, %2.sub1, implicit $exec
+ ; GCN-NEXT: early-clobber %3:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 [[DEF]].sub0_sub1, [[DEF1]].sub0_sub1, 0, 0, 0, 0, implicit $mode, implicit $exec
+ ; GCN-NEXT: S_ENDPGM 0, implicit %2, implicit %3, implicit [[V_MUL_LO_U32_e64_]]
+ ;
+ ; GCN-NO-SCHEDMODEL-LABEL: name: sched_group_barrier_1_VMEM_READ_1_VALU_5_MFMA_1_VMEM_READ_3_VALU_2_VMEM_WRITE
+ ; GCN-NO-SCHEDMODEL: [[DEF:%[0-9]+]]:vreg_128_align2 = IMPLICIT_DEF
+ ; GCN-NO-SCHEDMODEL-NEXT: [[DEF1:%[0-9]+]]:vreg_128_align2 = IMPLICIT_DEF
+ ; GCN-NO-SCHEDMODEL-NEXT: early-clobber %2:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 [[DEF]].sub0_sub1, [[DEF1]].sub0_sub1, 0, 0, 0, 0, implicit $mode, implicit $exec
+ ; GCN-NO-SCHEDMODEL-NEXT: early-clobber %3:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 [[DEF]].sub0_sub1, [[DEF1]].sub0_sub1, 0, 0, 0, 0, implicit $mode, implicit $exec
+ ; GCN-NO-SCHEDMODEL-NEXT: [[V_MUL_LO_U32_e64_:%[0-9]+]]:vgpr_32 = nsw V_MUL_LO_U32_e64 %2.sub0, %2.sub1, implicit $exec
+ ; GCN-NO-SCHEDMODEL-NEXT: [[DEF2:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+ ; GCN-NO-SCHEDMODEL-NEXT: dead [[DS_READ_U16_gfx9_:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF2]], 0, 0, implicit $exec
+ ; GCN-NO-SCHEDMODEL-NEXT: [[DEF3:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+ ; GCN-NO-SCHEDMODEL-NEXT: dead [[DS_READ_U16_gfx9_1:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF3]], 0, 0, implicit $exec
+ ; GCN-NO-SCHEDMODEL-NEXT: [[DEF4:%[0-9]+]]:vgpr_32 = IMPLICIT_DEF
+ ; GCN-NO-SCHEDMODEL-NEXT: dead [[DS_READ_U16_gfx9_2:%[0-9]+]]:vgpr_32 = DS_READ_U16_gfx9 [[DEF4]], 0, 0, implicit $exec
+ ; GCN-NO-SCHEDMODEL-NEXT: S_ENDPGM 0, implicit %2, implicit %3, implicit [[V_MUL_LO_U32_e64_]]
+ %0:vreg_128_align2 = IMPLICIT_DEF
+ %1:vreg_128_align2 = IMPLICIT_DEF
+ %2:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 %0.sub0_sub1:vreg_128_align2, %1.sub0_sub1:vreg_128_align2, 0, 0, 0, 0, implicit $mode, implicit $exec
+ %3:vreg_512_align2 = contract V_MFMA_F32_32X32X16_FP8_FP8_vgprcd_e64 %0.sub0_sub1:vreg_128_align2, %1.sub0_sub1:vreg_128_align2, 0, 0, 0, 0, implicit $mode, implicit $exec
+ %4:vgpr_32 = nsw V_MUL_LO_U32_e64 %2.sub0, %2.sub1, implicit $exec
+ %5:vgpr_32 = IMPLICIT_DEF
+ %6:vgpr_32 = DS_READ_U16_gfx9 %5, 0, 0, implicit $exec
+ %7:vgpr_32 = IMPLICIT_DEF
+ %8:vgpr_32 = DS_READ_U16_gfx9 %7, 0, 0, implicit $exec
+ %9:vgpr_32 = IMPLICIT_DEF
+ %10:vgpr_32 = DS_READ_U16_gfx9 %9, 0, 0, implicit $exec
+ S_ENDPGM 0, implicit %2, implicit %3, implicit %4
+...
I don't know why you would want to disable this only during a specific pass. In general, we have too many of these old debug flags that were used for bringup and have no real use case.
Change-Id: I2c7080bce7fadbb7b6c471457edbc0606c1b0bb0
Certain passes may need this info for correctness. I've ported the existing flags into the Scheduler so that they only take effect in the scheduling pass.
✅ With the latest revision this PR passed the C/C++ code formatter.
Change-Id: Ied902da014ca3dff4fc47f2a0871523b0dcd97da
Only user of this flag that I see is in
I'd be much more in favour of getting rid of these kludge flags once and for all tbh.
static cl::opt<bool>
    EnableSchedModel("schedmodel", cl::Hidden, cl::init(true),
                     cl::desc("Use TargetSchedModel for latency lookup"));

static cl::opt<bool>
    EnableSchedItins("scheditins", cl::Hidden, cl::init(true),
                     cl::desc("Use InstrItineraryData for latency lookup"));
These are mutually exclusive though? What happens if you set both?
As far as I can tell, I don't see this mutual exclusion constraint encoded. It seems like the API handles the case of having neither.
Default if both are missing:
if (!hasInstrSchedModel() && !hasInstrItineraries())
Default if both are missing:
return TII->defaultDefLatency(SchedModel, *MI);
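To make the "what happens if you set both?" question concrete, here is a minimal standalone sketch of the fallback order. `ToyLatencyModel`, its fields, and `computeLatency` are hypothetical stand-ins rather than the LLVM classes, and the choice that itineraries take precedence when both sources are usable is an assumption made for illustration.

```cpp
#include <optional>

// Hypothetical stand-in for TargetSchedModel; this is not the LLVM class.
struct ToyLatencyModel {
  std::optional<unsigned> SchedModelLatency; // set if a sched model entry exists
  std::optional<unsigned> ItineraryLatency;  // set if itinerary data exists
  bool EnableSchedModel = true;              // plays the role of -schedmodel
  bool EnableSchedItins = true;              // plays the role of -scheditins
  unsigned DefaultLatency = 1;               // assumed generic fallback value

  bool hasInstrSchedModel() const {
    return EnableSchedModel && SchedModelLatency.has_value();
  }
  bool hasInstrItineraries() const {
    return EnableSchedItins && ItineraryLatency.has_value();
  }

  unsigned computeLatency() const {
    // Neither source is usable: fall back to the default latency.
    if (!hasInstrSchedModel() && !hasInstrItineraries())
      return DefaultLatency;
    // Assumption for this sketch: itineraries win when both are usable.
    if (hasInstrItineraries())
      return *ItineraryLatency;
    return *SchedModelLatency;
  }
};
```

In this sketch, setting both flags is not an error; the lookup simply picks one source, and clearing both flags lands on the default path.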
bool EnableSchedModel = true;
bool EnableSchedItins = true;
Document these, maybe should just make it an enum for which type to use
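One way the enum suggestion could look, as a rough sketch only: `LatencySource`, `ToySchedConfig`, and `describe` are hypothetical names that do not exist in LLVM. The point is that a single enumerator cannot express "both enabled", so the mutual-exclusion question disappears.

```cpp
#include <string_view>

// Hypothetical enum replacing the two booleans; names are illustrative only.
enum class LatencySource { None, Itineraries, SchedModel };

struct ToySchedConfig {
  // Which latency source to consult; exactly one value is active at a time.
  LatencySource Source = LatencySource::SchedModel;

  bool hasInstrSchedModel() const {
    return Source == LatencySource::SchedModel;
  }
  bool hasInstrItineraries() const {
    return Source == LatencySource::Itineraries;
  }
};

// Human-readable description of each choice, e.g. for an option help string.
constexpr std::string_view describe(LatencySource S) {
  switch (S) {
  case LatencySource::None:
    return "default latencies only";
  case LatencySource::Itineraries:
    return "InstrItineraryData";
  case LatencySource::SchedModel:
    return "TargetSchedModel";
  }
  return "unknown";
}
```

`LatencySource::None` would correspond to the latency / hazard agnostic mode discussed in this thread.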
Change-Id: I132cdb3b5709ac84ae858fa1aecee399abcec63f
Presently I find that they are a good way to experiment with latency / hazard agnostic scheduling. But I don't disagree with you: I think the original intent of these flags was to prefer Itins over SchedModel or vice versa, and for that purpose I agree that we shouldn't have these flags.
…eduling (llvm#138057)
This provides the `disable-schedmodel-in-sched-mi` flag. Using this, we will disable the SchedModel / Itineraries during scheduling, so no latency or hardware resource information is used for scheduling decisions.
We have the `schedmodel` flag, but it disables the `SchedModel` for all passes. The new flag disables it only for scheduling while preserving the behavior of other passes (e.g. MachineLICM). This is conceptually similar to other flags like `enable-aa-sched-mi`.
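The difference between a global flag and a per-client disable bit can be sketched in a few lines. `ToySchedModel`, `makeSchedulerModel`, and `makeOtherPassModel` are hypothetical names for illustration, not LLVM code; only the shape of `init` with a defaulted `Disable` parameter follows the original revision of the diff.

```cpp
// Hypothetical miniature of the gating pattern in the diff; not LLVM code.
struct ToySchedModel {
  bool TargetHasSchedModel = false;
  bool TargetHasItineraries = false;
  bool DisableItinerariesAndSchedModel = false;

  // The disable bit is stored per instance, so each client decides for itself.
  void init(bool HasSchedModel, bool HasItineraries, bool Disable = false) {
    TargetHasSchedModel = HasSchedModel;
    TargetHasItineraries = HasItineraries;
    DisableItinerariesAndSchedModel = Disable;
  }

  bool hasInstrSchedModel() const {
    return TargetHasSchedModel && !DisableItinerariesAndSchedModel;
  }
  bool hasInstrItineraries() const {
    return TargetHasItineraries && !DisableItinerariesAndSchedModel;
  }
};

// A scheduler-like client forwards the command-line flag to init().
inline ToySchedModel makeSchedulerModel(bool DisableFlag) {
  ToySchedModel M;
  M.init(/*HasSchedModel=*/true, /*HasItineraries=*/false, DisableFlag);
  return M;
}

// Any other client omits the argument and keeps the model enabled.
inline ToySchedModel makeOtherPassModel() {
  ToySchedModel M;
  M.init(/*HasSchedModel=*/true, /*HasItineraries=*/false);
  return M;
}
```

With the flag set, only the scheduler-like instance reports no model; the other instance is unaffected, which is the behavior the description claims for passes such as MachineLICM.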