LLVM 22.0.0git
PeepholeOptimizer.cpp
//===- PeepholeOptimizer.cpp - Peephole Optimizations ---------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// Perform peephole optimizations on the machine code:
//
// - Optimize Extensions
//
//     Optimization of sign / zero extension instructions. It may be extended to
//     handle other instructions with similar properties.
//
//     On some targets, some instructions, e.g. X86 sign / zero extension, may
//     leave the source value in the lower part of the result. This optimization
//     will replace some uses of the pre-extension value with uses of the
//     sub-register of the results.
//
// - Optimize Comparisons
//
//     Optimization of comparison instructions. For instance, in this code:
//
//       sub r1, 1
//       cmp r1, 0
//       bz  L1
//
//     If the "sub" instruction already sets (or could be modified to set) the
//     same flag that the "cmp" instruction sets and that "bz" uses, then we can
//     eliminate the "cmp" instruction.
//
//     Another instance, in this code:
//
//       sub r1, r3 | sub r1, imm
//       cmp r3, r1 or cmp r1, r3 | cmp r1, imm
//       bge L1
//
//     If the branch instruction can use a flag from "sub", then we can replace
//     "sub" with "subs" and eliminate the "cmp" instruction.
//
// - Optimize Loads:
//
//     Loads that can be folded into a later instruction. A load is foldable
//     if it loads to virtual registers and the virtual register defined has
//     a single use.
//
// - Optimize Copies and Bitcasts (more generally, target specific copies):
//
//     Rewrite copies and bitcasts to avoid cross register bank copies
//     when possible.
//     E.g., consider the following example, where capital and lowercase
//     letters denote different register files:
//       b = copy A <-- cross-bank copy
//       C = copy b <-- cross-bank copy
//     =>
//       b = copy A <-- cross-bank copy
//       C = copy A <-- same-bank copy
//
//     E.g., for bitcasts:
//       b = bitcast A <-- cross-bank copy
//       C = bitcast b <-- cross-bank copy
//     =>
//       b = bitcast A <-- cross-bank copy
//       C = copy A    <-- same-bank copy
//
//===----------------------------------------------------------------------===//
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/Statistic.h"
#include "llvm/MC/LaneBitmask.h"
#include "llvm/MC/MCInstrDesc.h"
#include "llvm/Pass.h"
#include "llvm/Support/CommandLine.h" // needed by the cl::opt options below
#include "llvm/Support/Debug.h"
#include <cassert>
#include <cstdint>
#include <utility>

using namespace llvm;

#define DEBUG_TYPE "peephole-opt"
// Optimize Extensions
static cl::opt<bool> Aggressive("aggressive-ext-opt", cl::Hidden,
                                cl::desc("Aggressive extension optimization"));

static cl::opt<bool>
    DisablePeephole("disable-peephole", cl::Hidden, cl::init(false),
                    cl::desc("Disable the peephole optimizer"));

/// Specify whether or not the value tracking looks through
/// complex instructions. When this is true, the value tracker
/// bails on everything that is not a copy or a bitcast.
static cl::opt<bool>
    DisableAdvCopyOpt("disable-adv-copy-opt", cl::Hidden, cl::init(false),
                      cl::desc("Disable advanced copy optimization"));

static cl::opt<bool> DisableNAPhysCopyOpt(
    "disable-non-allocatable-phys-copy-opt", cl::Hidden, cl::init(false),
    cl::desc("Disable non-allocatable physical register copy optimization"));

// Limit the number of PHI instructions to process
// in PeepholeOptimizer::getNextSource.
static cl::opt<unsigned>
    RewritePHILimit("rewrite-phi-limit", cl::Hidden, cl::init(10),
                    cl::desc("Limit the length of PHI chains to lookup"));

// Limit the length of recurrence chain when evaluating the benefit of
// commuting operands.
static cl::opt<unsigned> MaxRecurrenceChain(
    "recurrence-chain-limit", cl::Hidden, cl::init(3),
    cl::desc("Maximum length of recurrence chain when evaluating the benefit "
             "of commuting operands"));
STATISTIC(NumReuse, "Number of extension results reused");
STATISTIC(NumCmps, "Number of compares eliminated");
STATISTIC(NumImmFold, "Number of move immediates folded");
STATISTIC(NumLoadFold, "Number of loads folded");
STATISTIC(NumSelects, "Number of selects optimized");
STATISTIC(NumUncoalescableCopies, "Number of uncoalescable copies optimized");
STATISTIC(NumRewrittenCopies, "Number of copies rewritten");
STATISTIC(NumNAPhysCopies, "Number of non-allocatable physical copies removed");

namespace {

class ValueTrackerResult;
class RecurrenceInstr;
/// Interface to query instructions amenable to copy rewriting.
class Rewriter {
protected:
  MachineInstr &CopyLike;
  int CurrentSrcIdx = 0; ///< The index of the source being rewritten.

public:
  Rewriter(MachineInstr &CopyLike) : CopyLike(CopyLike) {}
  virtual ~Rewriter() = default;

  /// Get the next rewritable source (SrcReg, SrcSubReg) and
  /// the related value that it affects (DstReg, DstSubReg).
  /// A source is considered rewritable if its register class and the
  /// register class of the related DstReg may not be register
  /// coalescer friendly. In other words, given a copy-like instruction,
  /// not all of its arguments may be returned as rewritable sources,
  /// since some arguments are already register coalescer friendly.
  ///
  /// Each call of this method moves the current source to the next
  /// rewritable source.
  /// For instance, let CopyLike be the instruction to rewrite.
  /// CopyLike has one definition and one source:
  /// dst.dstSubIdx = CopyLike src.srcSubIdx.
  ///
  /// The first call will give the first rewritable source, i.e.,
  /// the only source this instruction has:
  /// (SrcReg, SrcSubReg) = (src, srcSubIdx).
  /// This source defines the whole definition, i.e.,
  /// (DstReg, DstSubReg) = (dst, dstSubIdx).
  ///
  /// The second and subsequent calls will return false, as there is only one
  /// rewritable source.
  ///
  /// \return True if a rewritable source has been found, false otherwise.
  /// The output arguments are valid if and only if true is returned.
  virtual bool getNextRewritableSource(RegSubRegPair &Src,
                                       RegSubRegPair &Dst) = 0;

  /// Rewrite the current source with \p NewReg and \p NewSubReg if possible.
  /// \return True if the rewriting was possible, false otherwise.
  virtual bool RewriteCurrentSource(Register NewReg, unsigned NewSubReg) = 0;
};
/// Rewriter for COPY instructions.
class CopyRewriter : public Rewriter {
public:
  CopyRewriter(MachineInstr &MI) : Rewriter(MI) {
    assert(MI.isCopy() && "Expected copy instruction");
  }
  virtual ~CopyRewriter() = default;

  bool getNextRewritableSource(RegSubRegPair &Src,
                               RegSubRegPair &Dst) override {
    if (++CurrentSrcIdx > 1)
      return false;

    // The rewritable source is the argument.
    const MachineOperand &MOSrc = CopyLike.getOperand(CurrentSrcIdx);
    Src = RegSubRegPair(MOSrc.getReg(), MOSrc.getSubReg());
    // What we track are the alternative sources of the definition.
    const MachineOperand &MODef = CopyLike.getOperand(0);
    Dst = RegSubRegPair(MODef.getReg(), MODef.getSubReg());
    return true;
  }

  bool RewriteCurrentSource(Register NewReg, unsigned NewSubReg) override {
    MachineOperand &MOSrc = CopyLike.getOperand(CurrentSrcIdx);
    MOSrc.setReg(NewReg);
    MOSrc.setSubReg(NewSubReg);
    return true;
  }
};
/// Helper class to rewrite uncoalescable copy-like instructions
/// into new COPY (coalescer friendly) instructions.
class UncoalescableRewriter : public Rewriter {
  int NumDefs; ///< Number of defs in the bitcast.

public:
  UncoalescableRewriter(MachineInstr &MI) : Rewriter(MI) {
    NumDefs = MI.getDesc().getNumDefs();
  }

  /// \see Rewriter::getNextRewritableSource()
  /// All such sources need to be considered rewritable in order to
  /// rewrite an uncoalescable copy-like instruction. This method returns
  /// each definition that must be checked if rewritable.
  bool getNextRewritableSource(RegSubRegPair &Src,
                               RegSubRegPair &Dst) override {
    // Find the next non-dead definition and continue from there.
    if (CurrentSrcIdx == NumDefs)
      return false;

    while (CopyLike.getOperand(CurrentSrcIdx).isDead()) {
      ++CurrentSrcIdx;
      if (CurrentSrcIdx == NumDefs)
        return false;
    }

    // What we track are the alternative sources of the definition.
    Src = RegSubRegPair(0, 0);
    const MachineOperand &MODef = CopyLike.getOperand(CurrentSrcIdx);
    Dst = RegSubRegPair(MODef.getReg(), MODef.getSubReg());

    CurrentSrcIdx++;
    return true;
  }

  bool RewriteCurrentSource(Register NewReg, unsigned NewSubReg) override {
    return false;
  }
};
/// Specialized rewriter for INSERT_SUBREG instructions.
class InsertSubregRewriter : public Rewriter {
public:
  InsertSubregRewriter(MachineInstr &MI) : Rewriter(MI) {
    assert(MI.isInsertSubreg() && "Invalid instruction");
  }

  /// \see Rewriter::getNextRewritableSource()
  /// Here CopyLike has the following form:
  /// dst = INSERT_SUBREG Src1, Src2.src2SubIdx, subIdx.
  /// Src1 has the same register class as dst, hence, there is
  /// nothing to rewrite.
  /// Src2.src2SubIdx may not be register coalescer friendly.
  /// Therefore, the first call to this method returns:
  /// (SrcReg, SrcSubReg) = (Src2, src2SubIdx).
  /// (DstReg, DstSubReg) = (dst, subIdx).
  ///
  /// Subsequent calls will return false.
  bool getNextRewritableSource(RegSubRegPair &Src,
                               RegSubRegPair &Dst) override {
    // If we already got the only source we can rewrite, return false.
    if (CurrentSrcIdx == 2)
      return false;
    // We are looking at v2 = INSERT_SUBREG v0, v1, sub0.
    CurrentSrcIdx = 2;
    const MachineOperand &MOInsertedReg = CopyLike.getOperand(2);
    Src = RegSubRegPair(MOInsertedReg.getReg(), MOInsertedReg.getSubReg());
    const MachineOperand &MODef = CopyLike.getOperand(0);

    // We want to track something that is compatible with the
    // partial definition.
    if (MODef.getSubReg())
      // Bail if we have to compose sub-register indices.
      return false;
    Dst = RegSubRegPair(MODef.getReg(),
                        (unsigned)CopyLike.getOperand(3).getImm());
    return true;
  }

  bool RewriteCurrentSource(Register NewReg, unsigned NewSubReg) override {
    if (CurrentSrcIdx != 2)
      return false;
    // We are rewriting the inserted reg.
    MachineOperand &MO = CopyLike.getOperand(CurrentSrcIdx);
    MO.setReg(NewReg);
    MO.setSubReg(NewSubReg);
    return true;
  }
};
/// Specialized rewriter for EXTRACT_SUBREG instructions.
class ExtractSubregRewriter : public Rewriter {
  const TargetInstrInfo &TII;

public:
  ExtractSubregRewriter(MachineInstr &MI, const TargetInstrInfo &TII)
      : Rewriter(MI), TII(TII) {
    assert(MI.isExtractSubreg() && "Invalid instruction");
  }

  /// \see Rewriter::getNextRewritableSource()
  /// Here CopyLike has the following form:
  /// dst.dstSubIdx = EXTRACT_SUBREG Src, subIdx.
  /// There is only one rewritable source: Src.subIdx,
  /// which defines dst.dstSubIdx.
  bool getNextRewritableSource(RegSubRegPair &Src,
                               RegSubRegPair &Dst) override {
    // If we already got the only source we can rewrite, return false.
    if (CurrentSrcIdx == 1)
      return false;
    // We are looking at v1 = EXTRACT_SUBREG v0, sub0.
    CurrentSrcIdx = 1;
    const MachineOperand &MOExtractedReg = CopyLike.getOperand(1);
    // If we have to compose sub-register indices, bail out.
    if (MOExtractedReg.getSubReg())
      return false;

    Src =
        RegSubRegPair(MOExtractedReg.getReg(), CopyLike.getOperand(2).getImm());

    // We want to track something that is compatible with the definition.
    const MachineOperand &MODef = CopyLike.getOperand(0);
    Dst = RegSubRegPair(MODef.getReg(), MODef.getSubReg());
    return true;
  }

  bool RewriteCurrentSource(Register NewReg, unsigned NewSubReg) override {
    // The only source we can rewrite is the input register.
    if (CurrentSrcIdx != 1)
      return false;

    CopyLike.getOperand(CurrentSrcIdx).setReg(NewReg);

    // If we find a source that does not require extracting anything,
    // rewrite the operation with a copy.
    if (!NewSubReg) {
      // Move the current index to an invalid position.
      // We do not want another call to this method to be able
      // to make any change.
      CurrentSrcIdx = -1;
      // Rewrite the operation as a COPY.
      // Get rid of the sub-register index.
      CopyLike.removeOperand(2);
      // Morph the operation into a COPY.
      CopyLike.setDesc(TII.get(TargetOpcode::COPY));
      return true;
    }
    CopyLike.getOperand(CurrentSrcIdx + 1).setImm(NewSubReg);
    return true;
  }
};
/// Specialized rewriter for REG_SEQUENCE instructions.
class RegSequenceRewriter : public Rewriter {
public:
  RegSequenceRewriter(MachineInstr &MI) : Rewriter(MI) {
    assert(MI.isRegSequence() && "Invalid instruction");
    CurrentSrcIdx = -1;
  }

  /// \see Rewriter::getNextRewritableSource()
  /// Here CopyLike has the following form:
  /// dst = REG_SEQUENCE Src1.src1SubIdx, subIdx1, Src2.src2SubIdx, subIdx2.
  /// Each call will return a different source, walking all the available
  /// sources.
  ///
  /// The first call returns:
  /// (SrcReg, SrcSubReg) = (Src1, src1SubIdx).
  /// (DstReg, DstSubReg) = (dst, subIdx1).
  ///
  /// The second call returns:
  /// (SrcReg, SrcSubReg) = (Src2, src2SubIdx).
  /// (DstReg, DstSubReg) = (dst, subIdx2).
  ///
  /// And so on, until all the sources have been traversed, then
  /// it returns false.
  bool getNextRewritableSource(RegSubRegPair &Src,
                               RegSubRegPair &Dst) override {
    // We are looking at v0 = REG_SEQUENCE v1, sub1, v2, sub2, etc.
    CurrentSrcIdx += 2;
    if (static_cast<unsigned>(CurrentSrcIdx) >= CopyLike.getNumOperands())
      return false;

    const MachineOperand &MOInsertedReg = CopyLike.getOperand(CurrentSrcIdx);
    Src.Reg = MOInsertedReg.getReg();
    Src.SubReg = MOInsertedReg.getSubReg();

    // We want to track something that is compatible with the related
    // partial definition.
    Dst.SubReg = CopyLike.getOperand(CurrentSrcIdx + 1).getImm();

    const MachineOperand &MODef = CopyLike.getOperand(0);
    Dst.Reg = MODef.getReg();
    assert(MODef.getSubReg() == 0 && "cannot have subregister def in SSA");
    return true;
  }

  bool RewriteCurrentSource(Register NewReg, unsigned NewSubReg) override {
    MachineOperand &MO = CopyLike.getOperand(CurrentSrcIdx);
    MO.setReg(NewReg);
    MO.setSubReg(NewSubReg);
    return true;
  }
};
class PeepholeOptimizer : private MachineFunction::Delegate {
  const TargetInstrInfo *TII = nullptr;
  const TargetRegisterInfo *TRI = nullptr;
  MachineRegisterInfo *MRI = nullptr;
  MachineDominatorTree *DT = nullptr; // Machine dominator tree
  MachineLoopInfo *MLI = nullptr;

public:
  PeepholeOptimizer(MachineDominatorTree *DT, MachineLoopInfo *MLI)
      : DT(DT), MLI(MLI) {}

  bool run(MachineFunction &MF);

  /// Track Def -> Use info used for rewriting copies.
  using RewriteMapTy = SmallDenseMap<RegSubRegPair, ValueTrackerResult>;

  /// Sequence of instructions that formulate a recurrence cycle.
  using RecurrenceCycle = SmallVector<RecurrenceInstr, 4>;

private:
  bool optimizeCmpInstr(MachineInstr &MI);
  bool optimizeExtInstr(MachineInstr &MI, MachineBasicBlock &MBB,
                        SmallPtrSetImpl<MachineInstr *> &LocalMIs);
  bool optimizeSelect(MachineInstr &MI,
                      SmallPtrSetImpl<MachineInstr *> &LocalMIs);
  bool optimizeCondBranch(MachineInstr &MI);

  bool optimizeCoalescableCopyImpl(Rewriter &&CpyRewriter);
  bool optimizeCoalescableCopy(MachineInstr &MI);
  bool optimizeUncoalescableCopy(MachineInstr &MI,
                                 SmallPtrSetImpl<MachineInstr *> &LocalMIs);
  bool optimizeRecurrence(MachineInstr &PHI);
  bool findNextSource(const TargetRegisterClass *DefRC, unsigned DefSubReg,
                      RegSubRegPair RegSubReg, RewriteMapTy &RewriteMap);
  bool isMoveImmediate(MachineInstr &MI, SmallSet<Register, 4> &ImmDefRegs,
                       DenseMap<Register, MachineInstr *> &ImmDefMIs);
  bool foldImmediate(MachineInstr &MI, SmallSet<Register, 4> &ImmDefRegs,
                     DenseMap<Register, MachineInstr *> &ImmDefMIs,
                     bool &Deleted);

  /// Finds recurrence cycles, but only ones that are formulated around
  /// a def operand and a use operand that are tied. If there is a use
  /// operand commutable with the tied use operand, find the recurrence
  /// cycle along that operand as well.
  bool findTargetRecurrence(Register Reg,
                            const SmallSet<Register, 2> &TargetReg,
                            RecurrenceCycle &RC);

  /// If copy instruction \p MI is a virtual register copy or a copy of a
  /// constant physical register to a virtual register, track it in the
  /// set CopySrcMIs. If this virtual register was previously seen as a
  /// copy, replace the uses of this copy with the previously seen copy's
  /// destination register.
  bool foldRedundantCopy(MachineInstr &MI);

  /// Is the register \p Reg a non-allocatable physical register?
  bool isNAPhysCopy(Register Reg);

  /// If copy instruction \p MI is a non-allocatable virtual<->physical
  /// register copy, track it in the \p NAPhysToVirtMIs map. If this
  /// non-allocatable physical register was previously copied to a virtual
  /// register and hasn't been clobbered, the virt->phys copy can be
  /// deleted.
  bool
  foldRedundantNAPhysCopy(MachineInstr &MI,
                          DenseMap<Register, MachineInstr *> &NAPhysToVirtMIs);

  bool isLoadFoldable(MachineInstr &MI,
                      SmallSet<Register, 16> &FoldAsLoadDefCandidates);

  /// Check whether \p MI is understood by the register coalescer
  /// but may require some rewriting.
  static bool isCoalescableCopy(const MachineInstr &MI) {
    // SubregToRegs are not interesting, because they are already register
    // coalescer friendly.
    return MI.isCopy() ||
           (!DisableAdvCopyOpt && (MI.isRegSequence() || MI.isInsertSubreg() ||
                                   MI.isExtractSubreg()));
  }

  /// Check whether \p MI is a copy-like instruction that is
  /// not recognized by the register coalescer.
  static bool isUncoalescableCopy(const MachineInstr &MI) {
    return MI.isBitcast() || (!DisableAdvCopyOpt && (MI.isRegSequenceLike() ||
                                                     MI.isInsertSubregLike() ||
                                                     MI.isExtractSubregLike()));
  }

  MachineInstr &rewriteSource(MachineInstr &CopyLike, RegSubRegPair Def,
                              RewriteMapTy &RewriteMap);

  // Set of copies to virtual registers keyed by source register. Never
  // holds any physreg which requires def tracking.
  DenseMap<RegSubRegPair, MachineInstr *> CopySrcMIs;

  // MachineFunction::Delegate implementation. Used to maintain CopySrcMIs.
  void MF_HandleInsertion(MachineInstr &MI) override {}

  bool getCopySrc(MachineInstr &MI, RegSubRegPair &SrcPair) {
    if (!MI.isCopy())
      return false;

    Register SrcReg = MI.getOperand(1).getReg();
    unsigned SrcSubReg = MI.getOperand(1).getSubReg();
    if (!SrcReg.isVirtual() && !MRI->isConstantPhysReg(SrcReg))
      return false;

    SrcPair = RegSubRegPair(SrcReg, SrcSubReg);
    return true;
  }

  // If a COPY instruction is to be deleted or changed, we should also remove
  // it from CopySrcMIs.
  void deleteChangedCopy(MachineInstr &MI) {
    RegSubRegPair SrcPair;
    if (!getCopySrc(MI, SrcPair))
      return;

    auto It = CopySrcMIs.find(SrcPair);
    if (It != CopySrcMIs.end() && It->second == &MI)
      CopySrcMIs.erase(It);
  }

  void MF_HandleRemoval(MachineInstr &MI) override { deleteChangedCopy(MI); }

  void MF_HandleChangeDesc(MachineInstr &MI, const MCInstrDesc &TID) override {
    deleteChangedCopy(MI);
  }
};
class PeepholeOptimizerLegacy : public MachineFunctionPass {
public:
  static char ID; // Pass identification

  PeepholeOptimizerLegacy() : MachineFunctionPass(ID) {
    initializePeepholeOptimizerLegacyPass(*PassRegistry::getPassRegistry());
  }

  bool runOnMachineFunction(MachineFunction &MF) override;

  void getAnalysisUsage(AnalysisUsage &AU) const override {
    AU.setPreservesCFG();
    MachineFunctionPass::getAnalysisUsage(AU);
    AU.addRequired<MachineLoopInfoWrapperPass>();
    AU.addPreserved<MachineLoopInfoWrapperPass>();
    if (Aggressive) {
      AU.addRequired<MachineDominatorTreeWrapperPass>();
      AU.addPreserved<MachineDominatorTreeWrapperPass>();
    }
  }

  MachineFunctionProperties getRequiredProperties() const override {
    return MachineFunctionProperties().setIsSSA();
  }
};
/// Helper class to hold instructions that are inside recurrence cycles.
/// The recurrence cycle is formulated around 1) a def operand and its
/// tied use operand, or 2) a def operand and a use operand that is commutable
/// with another use operand which is tied to the def operand. In the latter
/// case, the indices of the tied use operand and the commutable use operand
/// are maintained with CommutePair.
class RecurrenceInstr {
public:
  using IndexPair = std::pair<unsigned, unsigned>;

  RecurrenceInstr(MachineInstr *MI) : MI(MI) {}
  RecurrenceInstr(MachineInstr *MI, unsigned Idx1, unsigned Idx2)
      : MI(MI), CommutePair(std::make_pair(Idx1, Idx2)) {}

  MachineInstr *getMI() const { return MI; }
  std::optional<IndexPair> getCommutePair() const { return CommutePair; }

private:
  MachineInstr *MI;
  std::optional<IndexPair> CommutePair;
};
/// Helper class to hold a reply for ValueTracker queries.
/// Contains the returned sources for a given search and the instructions
/// where the sources were tracked from.
class ValueTrackerResult {
private:
  /// Track all sources found by one ValueTracker query.
  SmallVector<RegSubRegPair, 2> RegSrcs;

  /// Instruction using the sources in 'RegSrcs'.
  const MachineInstr *Inst = nullptr;

public:
  ValueTrackerResult() = default;

  ValueTrackerResult(Register Reg, unsigned SubReg) { addSource(Reg, SubReg); }

  bool isValid() const { return getNumSources() > 0; }

  void setInst(const MachineInstr *I) { Inst = I; }
  const MachineInstr *getInst() const { return Inst; }

  void clear() {
    RegSrcs.clear();
    Inst = nullptr;
  }

  void addSource(Register SrcReg, unsigned SrcSubReg) {
    RegSrcs.push_back(RegSubRegPair(SrcReg, SrcSubReg));
  }

  void setSource(int Idx, Register SrcReg, unsigned SrcSubReg) {
    assert(Idx < getNumSources() && "Reg pair source out of index");
    RegSrcs[Idx] = RegSubRegPair(SrcReg, SrcSubReg);
  }

  int getNumSources() const { return RegSrcs.size(); }

  RegSubRegPair getSrc(int Idx) const { return RegSrcs[Idx]; }

  Register getSrcReg(int Idx) const {
    assert(Idx < getNumSources() && "Reg source out of index");
    return RegSrcs[Idx].Reg;
  }

  unsigned getSrcSubReg(int Idx) const {
    assert(Idx < getNumSources() && "SubReg source out of index");
    return RegSrcs[Idx].SubReg;
  }

  bool operator==(const ValueTrackerResult &Other) const {
    if (Other.getInst() != getInst())
      return false;

    if (Other.getNumSources() != getNumSources())
      return false;

    for (int i = 0, e = Other.getNumSources(); i != e; ++i)
      if (Other.getSrcReg(i) != getSrcReg(i) ||
          Other.getSrcSubReg(i) != getSrcSubReg(i))
        return false;
    return true;
  }
};
/// Helper class to track the possible sources of a value defined by
/// a (chain of) copy related instructions.
/// Given a definition (instruction and definition index), this class
/// follows the use-def chain to find successive suitable sources.
/// The given source can be used to rewrite the definition into
/// def = COPY src.
///
/// For instance, let us consider the following snippet:
/// v0 =
/// v2 = INSERT_SUBREG v1, v0, sub0
/// def = COPY v2.sub0
///
/// Using a ValueTracker for def = COPY v2.sub0 will give the following
/// suitable sources:
/// v2.sub0 and v0.
/// Then, def can be rewritten into def = COPY v0.
class ValueTracker {
private:
  /// The current point into the use-def chain.
  const MachineInstr *Def = nullptr;

  /// The index of the definition in Def.
  unsigned DefIdx = 0;

  /// The sub register index of the definition.
  unsigned DefSubReg;

  /// The register where the value can be found.
  Register Reg;

  /// MachineRegisterInfo used to perform tracking.
  const MachineRegisterInfo &MRI;

  /// Optional TargetInstrInfo used to perform some complex tracking.
  const TargetInstrInfo *TII;

  /// Dispatcher to the right underlying implementation of getNextSource.
  ValueTrackerResult getNextSourceImpl();

  /// Specialized version of getNextSource for Copy instructions.
  ValueTrackerResult getNextSourceFromCopy();

  /// Specialized version of getNextSource for Bitcast instructions.
  ValueTrackerResult getNextSourceFromBitcast();

  /// Specialized version of getNextSource for RegSequence instructions.
  ValueTrackerResult getNextSourceFromRegSequence();

  /// Specialized version of getNextSource for InsertSubreg instructions.
  ValueTrackerResult getNextSourceFromInsertSubreg();

  /// Specialized version of getNextSource for ExtractSubreg instructions.
  ValueTrackerResult getNextSourceFromExtractSubreg();

  /// Specialized version of getNextSource for SubregToReg instructions.
  ValueTrackerResult getNextSourceFromSubregToReg();

  /// Specialized version of getNextSource for PHI instructions.
  ValueTrackerResult getNextSourceFromPHI();

public:
  /// Create a ValueTracker instance for the value defined by \p Reg.
  /// \p DefSubReg represents the sub register index the value tracker will
  /// track. It does not need to match the sub register index used in the
  /// definition of \p Reg.
  /// If \p Reg is a physical register, a value tracker constructed with
  /// this constructor will not find any alternative source.
  /// Indeed, when \p Reg is a physical register, this constructor does not
  /// know which definition of \p Reg it should track.
  /// Use the next constructor to track a physical register.
  ValueTracker(Register Reg, unsigned DefSubReg, const MachineRegisterInfo &MRI,
               const TargetInstrInfo *TII = nullptr)
      : DefSubReg(DefSubReg), Reg(Reg), MRI(MRI), TII(TII) {
    if (!Reg.isPhysical()) {
      Def = MRI.getVRegDef(Reg);
      DefIdx = MRI.def_begin(Reg).getOperandNo();
    }
  }

  /// Following the use-def chain, get the next available source
  /// for the tracked value.
  /// \return A ValueTrackerResult containing a set of registers
  /// and sub registers with tracked values. A ValueTrackerResult with
  /// an empty set of registers means no source was found.
  ValueTrackerResult getNextSource();
};
} // end anonymous namespace

char PeepholeOptimizerLegacy::ID = 0;

char &llvm::PeepholeOptimizerLegacyID = PeepholeOptimizerLegacy::ID;

INITIALIZE_PASS_BEGIN(PeepholeOptimizerLegacy, DEBUG_TYPE,
                      "Peephole Optimizations", false, false)
INITIALIZE_PASS_DEPENDENCY(MachineDominatorTreeWrapperPass)
INITIALIZE_PASS_DEPENDENCY(MachineLoopInfoWrapperPass)
INITIALIZE_PASS_END(PeepholeOptimizerLegacy, DEBUG_TYPE,
                    "Peephole Optimizations", false, false)
/// If instruction is a copy-like instruction, i.e. it reads a single register
/// and writes a single register and it does not modify the source, and if the
/// source value is preserved as a sub-register of the result, then replace all
/// reachable uses of the source with the subreg of the result.
///
/// Do not generate an EXTRACT that is used only in a debug use, as this changes
/// the code. Since this code does not currently share EXTRACTs, just ignore all
/// debug uses.
bool PeepholeOptimizer::optimizeExtInstr(
    MachineInstr &MI, MachineBasicBlock &MBB,
    SmallPtrSetImpl<MachineInstr *> &LocalMIs) {
  Register SrcReg, DstReg;
  unsigned SubIdx;
  if (!TII->isCoalescableExtInstr(MI, SrcReg, DstReg, SubIdx))
    return false;
784
785 if (DstReg.isPhysical() || SrcReg.isPhysical())
786 return false;
787
788 if (MRI->hasOneNonDBGUse(SrcReg))
789 // No other uses.
790 return false;
791
792 // Ensure DstReg can get a register class that actually supports
793 // sub-registers. Don't change the class until we commit.
794 const TargetRegisterClass *DstRC = MRI->getRegClass(DstReg);
795 DstRC = TRI->getSubClassWithSubReg(DstRC, SubIdx);
796 if (!DstRC)
797 return false;
798
799 // The ext instr may be operating on a sub-register of SrcReg as well.
800 // PPC::EXTSW is a 32 -> 64-bit sign extension, but it reads a 64-bit
801 // register.
802 // If UseSrcSubIdx is Set, SubIdx also applies to SrcReg, and only uses of
803 // SrcReg:SubIdx should be replaced.
804 bool UseSrcSubIdx =
805 TRI->getSubClassWithSubReg(MRI->getRegClass(SrcReg), SubIdx) != nullptr;
806
807 // The source has other uses. See if we can replace the other uses with use of
808 // the result of the extension.
810 for (MachineInstr &UI : MRI->use_nodbg_instructions(DstReg))
811 ReachedBBs.insert(UI.getParent());
812
813 // Uses that are in the same BB of uses of the result of the instruction.
815
816 // Uses that the result of the instruction can reach.
818
819 bool ExtendLife = true;
820 for (MachineOperand &UseMO : MRI->use_nodbg_operands(SrcReg)) {
821 MachineInstr *UseMI = UseMO.getParent();
822 if (UseMI == &MI)
823 continue;
824
825 if (UseMI->isPHI()) {
826 ExtendLife = false;
827 continue;
828 }
829
830 // Only accept uses of SrcReg:SubIdx.
831 if (UseSrcSubIdx && UseMO.getSubReg() != SubIdx)
832 continue;
833
834 // It's an error to translate this:
835 //
836 // %reg1025 = <sext> %reg1024
837 // ...
838 // %reg1026 = SUBREG_TO_REG 0, %reg1024, 4
839 //
840 // into this:
841 //
842 // %reg1025 = <sext> %reg1024
843 // ...
844 // %reg1027 = COPY %reg1025:4
845 // %reg1026 = SUBREG_TO_REG 0, %reg1027, 4
846 //
847 // The problem here is that SUBREG_TO_REG is there to assert that an
848 // implicit zext occurs. It doesn't insert a zext instruction. If we allow
849 // the COPY here, it will give us the value after the <sext>, not the
850 // original value of %reg1024 before <sext>.
851 if (UseMI->getOpcode() == TargetOpcode::SUBREG_TO_REG)
852 continue;
853
854 MachineBasicBlock *UseMBB = UseMI->getParent();
855 if (UseMBB == &MBB) {
856 // Local uses that come after the extension.
857 if (!LocalMIs.count(UseMI))
858 Uses.push_back(&UseMO);
859 } else if (ReachedBBs.count(UseMBB)) {
860 // Non-local uses where the result of the extension is used. Always
861 // replace these unless it's a PHI.
862 Uses.push_back(&UseMO);
863 } else if (Aggressive && DT->dominates(&MBB, UseMBB)) {
864 // We may want to extend the live range of the extension result in order
865 // to replace these uses.
866 ExtendedUses.push_back(&UseMO);
867 } else {
868 // Both will be live out of the def MBB anyway. Don't extend live range of
869 // the extension result.
870 ExtendLife = false;
871 break;
872 }
873 }
874
875 if (ExtendLife && !ExtendedUses.empty())
876 // Extend the liveness of the extension result.
877 Uses.append(ExtendedUses.begin(), ExtendedUses.end());
878
879 // Now replace all uses.
880 bool Changed = false;
881 if (!Uses.empty()) {
882 SmallPtrSet<MachineBasicBlock *, 4> PHIBBs;
883
884 // Look for PHI uses of the extended result, we don't want to extend the
885 // liveness of a PHI input. It breaks all kinds of assumptions down
886 // stream. A PHI use is expected to be the kill of its source values.
887 for (MachineInstr &UI : MRI->use_nodbg_instructions(DstReg))
888 if (UI.isPHI())
889 PHIBBs.insert(UI.getParent());
890
891 const TargetRegisterClass *RC = MRI->getRegClass(SrcReg);
892 for (MachineOperand *UseMO : Uses) {
893 MachineInstr *UseMI = UseMO->getParent();
894 MachineBasicBlock *UseMBB = UseMI->getParent();
895 if (PHIBBs.count(UseMBB))
896 continue;
897
898 // About to add uses of DstReg, clear DstReg's kill flags.
899 if (!Changed) {
900 MRI->clearKillFlags(DstReg);
901 MRI->constrainRegClass(DstReg, DstRC);
902 }
903
904 // SubReg defs are illegal in the machine SSA phase, so
905 // we should not generate SubReg defs.
906 //
907 // For example, for the instructions:
908 //
909 // %1:g8rc_and_g8rc_nox0 = EXTSW %0:g8rc
910 // %3:gprc_and_gprc_nor0 = COPY %0.sub_32:g8rc
911 //
912 // We should generate:
913 //
914 // %1:g8rc_and_g8rc_nox0 = EXTSW %0:g8rc
915 // %6:gprc_and_gprc_nor0 = COPY %1.sub_32:g8rc_and_g8rc_nox0
916 // %3:gprc_and_gprc_nor0 = COPY %6:gprc_and_gprc_nor0
917 //
918 if (UseSrcSubIdx)
919 RC = MRI->getRegClass(UseMI->getOperand(0).getReg());
920
921 Register NewVR = MRI->createVirtualRegister(RC);
922 BuildMI(*UseMBB, UseMI, UseMI->getDebugLoc(),
923 TII->get(TargetOpcode::COPY), NewVR)
924 .addReg(DstReg, 0, SubIdx);
925 if (UseSrcSubIdx)
926 UseMO->setSubReg(0);
927
928 UseMO->setReg(NewVR);
929 ++NumReuse;
930 Changed = true;
931 }
932 }
933
934 return Changed;
935}
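The use-rewriting loop above boils down to two rules: skip uses in blocks that also contain PHI uses of the extension result, and redirect every other use to a fresh virtual register copied from the extension result. A minimal standalone sketch of that bookkeeping — `Use`, `rewriteExtUses`, and the integer block/register ids are hypothetical stand-ins, not LLVM types:

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical model of the use-rewriting loop: each use of the
// pre-extension value (identified by the basic-block id it lives in) is
// redirected to a fresh virtual register copied from the extension result,
// unless its block contains a PHI use of that result.
struct Use { int BB; unsigned Reg; };

unsigned rewriteExtUses(std::vector<Use> &Uses, const std::set<int> &PHIBBs,
                        unsigned &NextVReg) {
  unsigned NumReuse = 0;
  for (Use &U : Uses) {
    if (PHIBBs.count(U.BB))
      continue; // Do not extend liveness into a PHI's source block.
    U.Reg = NextVReg++; // Stands in for the new COPY's destination register.
    ++NumReuse;
  }
  return NumReuse;
}
```

Uses in PHI blocks are left untouched; everything else is rewired and counted, mirroring the `NumReuse` statistic above.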
936
937/// If the instruction is a compare and the previous instruction it's comparing
938/// against already sets (or could be modified to set) the same flag as the
939/// compare, then we can remove the comparison and use the flag from the
940/// previous instruction.
941bool PeepholeOptimizer::optimizeCmpInstr(MachineInstr &MI) {
942 // If this instruction is a comparison against zero and isn't comparing a
943 // physical register, we can try to optimize it.
944 Register SrcReg, SrcReg2;
945 int64_t CmpMask, CmpValue;
946 if (!TII->analyzeCompare(MI, SrcReg, SrcReg2, CmpMask, CmpValue) ||
947 SrcReg.isPhysical() || SrcReg2.isPhysical())
948 return false;
949
950 // Attempt to optimize the comparison instruction.
951 LLVM_DEBUG(dbgs() << "Attempting to optimize compare: " << MI);
952 if (TII->optimizeCompareInstr(MI, SrcReg, SrcReg2, CmpMask, CmpValue, MRI)) {
953 LLVM_DEBUG(dbgs() << " -> Successfully optimized compare!\n");
954 ++NumCmps;
955 return true;
956 }
957
958 return false;
959}
960
961/// Optimize a select instruction.
962bool PeepholeOptimizer::optimizeSelect(
963 MachineInstr &MI, SmallPtrSetImpl<MachineInstr *> &LocalMIs) {
964 unsigned TrueOp = 0;
965 unsigned FalseOp = 0;
966 bool Optimizable = false;
967 SmallVector<MachineOperand, 4> Cond;
968 if (TII->analyzeSelect(MI, Cond, TrueOp, FalseOp, Optimizable))
969 return false;
970 if (!Optimizable)
971 return false;
972 if (!TII->optimizeSelect(MI, LocalMIs))
973 return false;
974 LLVM_DEBUG(dbgs() << "Deleting select: " << MI);
975 MI.eraseFromParent();
976 ++NumSelects;
977 return true;
978}
979
980/// Check if a simpler conditional branch can be generated.
981bool PeepholeOptimizer::optimizeCondBranch(MachineInstr &MI) {
982 return TII->optimizeCondBranch(MI);
983}
984
985/// Try to find a better source value that shares the same register file to
986/// replace \p RegSubReg in an instruction like
987/// `DefRC.DefSubReg = COPY RegSubReg`
988///
989/// When true is returned, the \p RewriteMap can be used by the client to
990 /// retrieve all Def -> Use along the way up to the next source. Any found
991 /// Use that is not itself a key for another entry is the next source to
992/// use. During the search for the next source, multiple sources can be found
993/// given multiple incoming sources of a PHI instruction. In this case, we
994/// look in each PHI source for the next source; all found next sources must
995 /// share the same register file as \p Reg and \p SubReg. The client should
996 /// then be capable of rewriting all intermediate PHIs to get the next source.
997/// \return False if no alternative sources are available. True otherwise.
998bool PeepholeOptimizer::findNextSource(const TargetRegisterClass *DefRC,
999 unsigned DefSubReg,
1000 RegSubRegPair RegSubReg,
1001 RewriteMapTy &RewriteMap) {
1002 // Do not try to find a new source for a physical register.
1003 // So far we do not have any motivating example for doing that.
1004 // Thus, instead of maintaining untested code, we will revisit that if
1005 // that changes at some point.
1006 Register Reg = RegSubReg.Reg;
1007 RegSubRegPair CurSrcPair = RegSubReg;
1008 SmallVector<RegSubRegPair, 4> SrcToLook = {CurSrcPair};
1009
1010 unsigned PHICount = 0;
1011 do {
1012 CurSrcPair = SrcToLook.pop_back_val();
1013 // As explained above, do not handle physical registers
1014 if (CurSrcPair.Reg.isPhysical())
1015 return false;
1016
1017 ValueTracker ValTracker(CurSrcPair.Reg, CurSrcPair.SubReg, *MRI, TII);
1018
1019 // Follow the chain of copies until we find a more suitable source, a phi,
1020 // or have to abort.
1021 while (true) {
1022 ValueTrackerResult Res = ValTracker.getNextSource();
1023 // Abort at the end of a chain (without finding a suitable source).
1024 if (!Res.isValid())
1025 return false;
1026
1027 // Insert the Def -> Use entry for the recently found source.
1028 auto [InsertPt, WasInserted] = RewriteMap.try_emplace(CurSrcPair, Res);
1029
1030 if (!WasInserted) {
1031 const ValueTrackerResult &CurSrcRes = InsertPt->second;
1032
1033 assert(CurSrcRes == Res && "ValueTrackerResult found must match");
1034 // An existing entry with multiple sources is a PHI cycle we must avoid.
1035 // Otherwise it's an entry with a valid next source we already found.
1036 if (CurSrcRes.getNumSources() > 1) {
1037 LLVM_DEBUG(dbgs()
1038 << "findNextSource: found PHI cycle, aborting...\n");
1039 return false;
1040 }
1041 break;
1042 }
1043
1044 // A ValueTrackerResult usually has one source unless it's the result of
1045 // a PHI instruction. Add the found PHI edges to be looked up further.
1046 unsigned NumSrcs = Res.getNumSources();
1047 if (NumSrcs > 1) {
1048 PHICount++;
1049 if (PHICount >= RewritePHILimit) {
1050 LLVM_DEBUG(dbgs() << "findNextSource: PHI limit reached\n");
1051 return false;
1052 }
1053
1054 for (unsigned i = 0; i < NumSrcs; ++i)
1055 SrcToLook.push_back(Res.getSrc(i));
1056 break;
1057 }
1058
1059 CurSrcPair = Res.getSrc(0);
1060 // Do not extend the live-ranges of physical registers as they add
1061 // constraints to the register allocator. Moreover, if we want to extend
1062 // the live-range of a physical register, unlike an SSA virtual register,
1063 // we will have to check that it isn't redefined before the related use.
1064 if (CurSrcPair.Reg.isPhysical())
1065 return false;
1066
1067 // Keep following the chain if the value isn't any better yet.
1068 const TargetRegisterClass *SrcRC = MRI->getRegClass(CurSrcPair.Reg);
1069 if (!TRI->shouldRewriteCopySrc(DefRC, DefSubReg, SrcRC,
1070 CurSrcPair.SubReg))
1071 continue;
1072
1073 // We currently cannot deal with subreg operands on PHI instructions
1074 // (see insertPHI()).
1075 if (PHICount > 0 && CurSrcPair.SubReg != 0)
1076 continue;
1077
1078 // We found a suitable source, and are done with this chain.
1079 break;
1080 }
1081 } while (!SrcToLook.empty());
1082
1083 // If we did not find a more suitable source, there is nothing to optimize.
1084 return CurSrcPair.Reg != Reg;
1085}
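The traversal above is a bounded worklist walk over a copy/PHI graph: follow single-source copies down a chain, fan out on PHI merges, and give up once too many PHIs have been crossed. A minimal standalone sketch of that shape, with registers as plain integers and a hypothetical `findSources` in place of the LLVM machinery:

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-ins for LLVM's RegSubRegPair/ValueTracker: each value
// is a plain register id, and "Srcs" maps a register to the registers it
// was copied (one entry) or PHI-merged (several entries) from.
using Reg = unsigned;

// Pop a candidate, follow single-source copies, fan out on PHIs until the
// limit is hit. Returns the leaf sources reached, or empty on PHI overflow.
std::vector<Reg> findSources(Reg Start,
                             const std::map<Reg, std::vector<Reg>> &Srcs,
                             unsigned PHILimit) {
  std::vector<Reg> Worklist = {Start};
  std::vector<Reg> Leaves;
  unsigned PHICount = 0;
  while (!Worklist.empty()) {
    Reg Cur = Worklist.back();
    Worklist.pop_back();
    while (true) {
      auto It = Srcs.find(Cur);
      if (It == Srcs.end()) { // End of chain: Cur is a leaf source.
        Leaves.push_back(Cur);
        break;
      }
      const std::vector<Reg> &S = It->second;
      if (S.size() > 1) { // PHI: fan out, bounded like RewritePHILimit.
        if (++PHICount >= PHILimit)
          return {};
        for (Reg R : S)
          Worklist.push_back(R);
        break;
      }
      Cur = S.front(); // Single-source copy: keep following the chain.
    }
  }
  return Leaves;
}
```

As in `findNextSource()`, the limit keeps pathological PHI webs from blowing up compile time.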
1086
1087/// Insert a PHI instruction with incoming edges \p SrcRegs that are
1088/// guaranteed to have the same register class. This is necessary whenever we
1089/// successfully traverse a PHI instruction and find suitable sources coming
1090/// from its edges. By inserting a new PHI, we provide a rewritten PHI def
1091/// suitable to be used in a new COPY instruction.
1092 static MachineInstr &insertPHI(MachineRegisterInfo &MRI,
1093 const TargetInstrInfo &TII,
1094 const SmallVectorImpl<RegSubRegPair> &SrcRegs,
1095 MachineInstr &OrigPHI) {
1096 assert(!SrcRegs.empty() && "No sources to create a PHI instruction?");
1097
1098 const TargetRegisterClass *NewRC = MRI.getRegClass(SrcRegs[0].Reg);
1099 // NewRC is only correct if no subregisters are involved. findNextSource()
1100 // should have rejected those cases already.
1101 assert(SrcRegs[0].SubReg == 0 && "should not have subreg operand");
1102 Register NewVR = MRI.createVirtualRegister(NewRC);
1103 MachineBasicBlock *MBB = OrigPHI.getParent();
1104 MachineInstrBuilder MIB = BuildMI(*MBB, &OrigPHI, OrigPHI.getDebugLoc(),
1105 TII.get(TargetOpcode::PHI), NewVR);
1106
1107 unsigned MBBOpIdx = 2;
1108 for (const RegSubRegPair &RegPair : SrcRegs) {
1109 MIB.addReg(RegPair.Reg, 0, RegPair.SubReg);
1110 MIB.addMBB(OrigPHI.getOperand(MBBOpIdx).getMBB());
1111 // Since we've extended the lifetime of RegPair.Reg, clear the
1112 // kill flags to account for that and make sure RegPair.Reg reaches
1113 // the new PHI.
1114 MRI.clearKillFlags(RegPair.Reg);
1115 MBBOpIdx += 2;
1116 }
1117
1118 return *MIB;
1119}
1120
1121/// Given a \p Def.Reg and Def.SubReg pair, use \p RewriteMap to find
1122/// the new source to use for rewrite. If \p HandleMultipleSources is true and
1123 /// multiple sources for a given \p Def are found along the way, we have found
1124 /// a PHI instruction that needs to be rewritten.
1125/// TODO: HandleMultipleSources should be removed once we test PHI handling
1126/// with coalescable copies.
1127static RegSubRegPair
1128 getNewSource(MachineRegisterInfo *MRI, const TargetInstrInfo *TII,
1129 RegSubRegPair Def,
1130 const PeepholeOptimizer::RewriteMapTy &RewriteMap,
1131 bool HandleMultipleSources = true) {
1132 RegSubRegPair LookupSrc(Def.Reg, Def.SubReg);
1133 while (true) {
1134 ValueTrackerResult Res = RewriteMap.lookup(LookupSrc);
1135 // If there are no entries in the map, LookupSrc is the new source.
1136 if (!Res.isValid())
1137 return LookupSrc;
1138
1139 // There's only one source for this definition, keep searching...
1140 unsigned NumSrcs = Res.getNumSources();
1141 if (NumSrcs == 1) {
1142 LookupSrc.Reg = Res.getSrcReg(0);
1143 LookupSrc.SubReg = Res.getSrcSubReg(0);
1144 continue;
1145 }
1146
1147 // TODO: Remove once multiple srcs w/ coalescable copies are supported.
1148 if (!HandleMultipleSources)
1149 break;
1150
1151 // Multiple sources, recurse into each source to find a new source
1152 // for it. Then, rewrite the PHI according to its new edges.
1153 SmallVector<RegSubRegPair, 4> NewPHISrcs;
1154 for (unsigned i = 0; i < NumSrcs; ++i) {
1155 RegSubRegPair PHISrc(Res.getSrcReg(i), Res.getSrcSubReg(i));
1156 NewPHISrcs.push_back(
1157 getNewSource(MRI, TII, PHISrc, RewriteMap, HandleMultipleSources));
1158 }
1159
1160 // Build the new PHI node and return its def register as the new source.
1161 MachineInstr &OrigPHI = const_cast<MachineInstr &>(*Res.getInst());
1162 MachineInstr &NewPHI = insertPHI(*MRI, *TII, NewPHISrcs, OrigPHI);
1163 LLVM_DEBUG(dbgs() << "-- getNewSource\n");
1164 LLVM_DEBUG(dbgs() << " Replacing: " << OrigPHI);
1165 LLVM_DEBUG(dbgs() << " With: " << NewPHI);
1166 const MachineOperand &MODef = NewPHI.getOperand(0);
1167 return RegSubRegPair(MODef.getReg(), MODef.getSubReg());
1168 }
1169
1170 return RegSubRegPair(0, 0);
1171}
1172
1173bool PeepholeOptimizer::optimizeCoalescableCopyImpl(Rewriter &&CpyRewriter) {
1174 bool Changed = false;
1175 // Get the right rewriter for the current copy.
1176 // Rewrite each rewritable source.
1177 RegSubRegPair Dst;
1178 RegSubRegPair TrackPair;
1179 while (CpyRewriter.getNextRewritableSource(TrackPair, Dst)) {
1180 if (Dst.Reg.isPhysical()) {
1181 // Do not try to find a new source for a physical register.
1182 // So far we do not have any motivating example for doing that.
1183 // Thus, instead of maintaining untested code, we will revisit that if
1184 // that changes at some point.
1185 continue;
1186 }
1187
1188 const TargetRegisterClass *DefRC = MRI->getRegClass(Dst.Reg);
1189
1190 // Keep track of PHI nodes and their incoming edges when looking for sources.
1191 RewriteMapTy RewriteMap;
1192 // Try to find a more suitable source. If we failed to do so, or only got
1193 // the original source back, move on to the next source.
1194 if (!findNextSource(DefRC, Dst.SubReg, TrackPair, RewriteMap))
1195 continue;
1196
1197 // Get the new source to rewrite. TODO: Only enable handling of multiple
1198 // sources (PHIs) once we have a motivating example and testcases for it.
1199 RegSubRegPair NewSrc = getNewSource(MRI, TII, TrackPair, RewriteMap,
1200 /*HandleMultipleSources=*/false);
1201 assert(TrackPair.Reg != NewSrc.Reg &&
1202 "should not rewrite source to original value");
1203 if (!NewSrc.Reg)
1204 continue;
1205
1206 if (NewSrc.SubReg) {
1207 // Verify the register class supports the subregister index. ARM's
1208 // copy-like queries return register:subreg pairs where the register's
1209 // current class does not directly support the subregister index.
1210 const TargetRegisterClass *RC = MRI->getRegClass(NewSrc.Reg);
1211 const TargetRegisterClass *WithSubRC =
1212 TRI->getSubClassWithSubReg(RC, NewSrc.SubReg);
1213 if (!MRI->constrainRegClass(NewSrc.Reg, WithSubRC))
1214 continue;
1215 Changed = true;
1216 }
1217
1218 // Rewrite source.
1219 if (CpyRewriter.RewriteCurrentSource(NewSrc.Reg, NewSrc.SubReg)) {
1220 // We may have extended the live-range of NewSrc, account for that.
1221 MRI->clearKillFlags(NewSrc.Reg);
1222 Changed = true;
1223 }
1224 }
1225
1226 // TODO: We could have a clean-up method to tidy the instruction.
1227 // E.g., v0 = INSERT_SUBREG v1, v1.sub0, sub0
1228 // => v0 = COPY v1
1229 // Currently we haven't seen a motivating example for that and we
1230 // want to avoid untested code.
1231 NumRewrittenCopies += Changed;
1232 return Changed;
1233}
1234
1235/// Optimize generic copy instructions to avoid cross register bank copy.
1236/// The optimization looks through a chain of copies and tries to find a source
1237/// that has a compatible register class.
1238/// Two register classes are considered to be compatible if they share the same
1239/// register bank.
1240/// New copies issued by this optimization are register allocator
1241/// friendly. This optimization does not remove any copy as it may
1242/// overconstrain the register allocator, but replaces some operands
1243/// when possible.
1244/// \pre isCoalescableCopy(*MI) is true.
1245/// \return True, when \p MI has been rewritten. False otherwise.
1246bool PeepholeOptimizer::optimizeCoalescableCopy(MachineInstr &MI) {
1247 assert(isCoalescableCopy(MI) && "Invalid argument");
1248 assert(MI.getDesc().getNumDefs() == 1 &&
1249 "Coalescer can understand multiple defs?!");
1250 const MachineOperand &MODef = MI.getOperand(0);
1251 // Do not rewrite physical definitions.
1252 if (MODef.getReg().isPhysical())
1253 return false;
1254
1255 switch (MI.getOpcode()) {
1256 case TargetOpcode::COPY:
1257 return optimizeCoalescableCopyImpl(CopyRewriter(MI));
1258 case TargetOpcode::INSERT_SUBREG:
1259 return optimizeCoalescableCopyImpl(InsertSubregRewriter(MI));
1260 case TargetOpcode::EXTRACT_SUBREG:
1261 return optimizeCoalescableCopyImpl(ExtractSubregRewriter(MI, *TII));
1262 case TargetOpcode::REG_SEQUENCE:
1263 return optimizeCoalescableCopyImpl(RegSequenceRewriter(MI));
1264 default:
1265 // Handle uncoalescable copy-like instructions.
1266 if (MI.isBitcast() || MI.isRegSequenceLike() || MI.isInsertSubregLike() ||
1267 MI.isExtractSubregLike())
1268 return optimizeCoalescableCopyImpl(UncoalescableRewriter(MI));
1269 return false;
1270 }
1271}
1272
1273/// Rewrite the source found through \p Def, by using the \p RewriteMap
1274/// and create a new COPY instruction. More info about RewriteMap in
1275/// PeepholeOptimizer::findNextSource. Right now this is only used to handle
1276/// Uncoalescable copies, since they are copy like instructions that aren't
1277/// recognized by the register allocator.
1278MachineInstr &PeepholeOptimizer::rewriteSource(MachineInstr &CopyLike,
1279 RegSubRegPair Def,
1280 RewriteMapTy &RewriteMap) {
1281 assert(!Def.Reg.isPhysical() && "We do not rewrite physical registers");
1282
1283 // Find the new source to use in the COPY rewrite.
1284 RegSubRegPair NewSrc = getNewSource(MRI, TII, Def, RewriteMap);
1285
1286 // Insert the COPY.
1287 const TargetRegisterClass *DefRC = MRI->getRegClass(Def.Reg);
1288 Register NewVReg = MRI->createVirtualRegister(DefRC);
1289
1290 if (NewSrc.SubReg) {
1291 const TargetRegisterClass *NewSrcRC = MRI->getRegClass(NewSrc.Reg);
1292 const TargetRegisterClass *WithSubRC =
1293 TRI->getSubClassWithSubReg(NewSrcRC, NewSrc.SubReg);
1294
1295 // The new source may not directly support the subregister, but we should be
1296 // able to assume it is constrainable to support the subregister (otherwise
1297 // ValueTracker was lying and reported a useless value).
1298 if (!MRI->constrainRegClass(NewSrc.Reg, WithSubRC))
1299 llvm_unreachable("replacement register cannot support subregister");
1300 }
1301
1302 MachineInstr *NewCopy =
1303 BuildMI(*CopyLike.getParent(), &CopyLike, CopyLike.getDebugLoc(),
1304 TII->get(TargetOpcode::COPY), NewVReg)
1305 .addReg(NewSrc.Reg, 0, NewSrc.SubReg);
1306
1307 if (Def.SubReg) {
1308 NewCopy->getOperand(0).setSubReg(Def.SubReg);
1309 NewCopy->getOperand(0).setIsUndef();
1310 }
1311
1312 LLVM_DEBUG(dbgs() << "-- RewriteSource\n");
1313 LLVM_DEBUG(dbgs() << " Replacing: " << CopyLike);
1314 LLVM_DEBUG(dbgs() << " With: " << *NewCopy);
1315 MRI->replaceRegWith(Def.Reg, NewVReg);
1316 MRI->clearKillFlags(NewVReg);
1317
1318 // We extended the lifetime of NewSrc.Reg, clear the kill flags to
1319 // account for that.
1320 MRI->clearKillFlags(NewSrc.Reg);
1321
1322 return *NewCopy;
1323}
1324
1325 /// Optimize copy-like instructions to create
1326 /// register coalescer friendly instructions.
1327 /// The optimization tries to kill off \p MI by looking
1328 /// through a chain of copies to find a source that has a compatible
1329 /// register class.
1330 /// If such a source is found, it replaces \p MI with a generic COPY
1331 /// operation.
1332 /// \pre isUncoalescableCopy(*MI) is true.
1333 /// \return True, when \p MI has been optimized. In that case, \p MI has
1334 /// been removed from its parent.
1335 /// All COPY instructions created are inserted in \p LocalMIs.
1336bool PeepholeOptimizer::optimizeUncoalescableCopy(
1337 MachineInstr &MI, SmallPtrSetImpl<MachineInstr *> &LocalMIs) {
1338 assert(isUncoalescableCopy(MI) && "Invalid argument");
1339 UncoalescableRewriter CpyRewriter(MI);
1340
1341 // Rewrite each rewritable source by generating new COPYs. This works
1342 // differently from optimizeCoalescableCopy since it first makes sure that all
1343 // definitions can be rewritten.
1344 RewriteMapTy RewriteMap;
1345 RegSubRegPair Src;
1346 RegSubRegPair Def;
1347 SmallVector<RegSubRegPair, 4> RewritePairs;
1348 while (CpyRewriter.getNextRewritableSource(Src, Def)) {
1349 // If a physical register is here, this is probably for a good reason.
1350 // Do not rewrite that.
1351 if (Def.Reg.isPhysical())
1352 return false;
1353
1354 // FIXME: Uncoalescable copies are treated differently by
1355 // UncoalescableRewriter, and this probably should not share
1356 // API. getNextRewritableSource really finds rewritable defs.
1357 const TargetRegisterClass *DefRC = MRI->getRegClass(Def.Reg);
1358
1359 // If we do not know how to rewrite this definition, there is no point
1360 // in trying to kill this instruction.
1361 if (!findNextSource(DefRC, Def.SubReg, Def, RewriteMap))
1362 return false;
1363
1364 RewritePairs.push_back(Def);
1365 }
1366
1367 // The change is possible for all defs, do it.
1368 for (const RegSubRegPair &Def : RewritePairs) {
1369 // Rewrite the "copy" in a way the register coalescer understands.
1370 MachineInstr &NewCopy = rewriteSource(MI, Def, RewriteMap);
1371 LocalMIs.insert(&NewCopy);
1372 }
1373
1374 // MI is now dead.
1375 LLVM_DEBUG(dbgs() << "Deleting uncoalescable copy: " << MI);
1376 MI.eraseFromParent();
1377 ++NumUncoalescableCopies;
1378 return true;
1379}
1380
1381/// Check whether MI is a candidate for folding into a later instruction.
1382/// We only fold loads to virtual registers and the virtual register defined
1383/// has a single user.
1384bool PeepholeOptimizer::isLoadFoldable(
1385 MachineInstr &MI, SmallSet<Register, 16> &FoldAsLoadDefCandidates) {
1386 if (!MI.canFoldAsLoad() || !MI.mayLoad())
1387 return false;
1388 const MCInstrDesc &MCID = MI.getDesc();
1389 if (MCID.getNumDefs() != 1)
1390 return false;
1391
1392 Register Reg = MI.getOperand(0).getReg();
1393 // To reduce compilation time, we check MRI->hasOneNonDBGUser when inserting
1394 // loads. It should be checked when processing uses of the load, since
1395 // uses can be removed during peephole.
1396 if (Reg.isVirtual() && !MI.getOperand(0).getSubReg() &&
1397 MRI->hasOneNonDBGUser(Reg)) {
1398 FoldAsLoadDefCandidates.insert(Reg);
1399 return true;
1400 }
1401 return false;
1402}
1403
1404bool PeepholeOptimizer::isMoveImmediate(
1405 MachineInstr &MI, SmallSet<Register, 4> &ImmDefRegs,
1406 DenseMap<Register, MachineInstr *> &ImmDefMIs) {
1407 const MCInstrDesc &MCID = MI.getDesc();
1408 if (MCID.getNumDefs() != 1 || !MI.getOperand(0).isReg())
1409 return false;
1410 Register Reg = MI.getOperand(0).getReg();
1411 if (!Reg.isVirtual())
1412 return false;
1413
1414 int64_t ImmVal;
1415 if (!MI.isMoveImmediate() && !TII->getConstValDefinedInReg(MI, Reg, ImmVal))
1416 return false;
1417
1418 ImmDefMIs.insert(std::make_pair(Reg, &MI));
1419 ImmDefRegs.insert(Reg);
1420 return true;
1421}
1422
1423/// Try folding register operands that are defined by move immediate
1424/// instructions, i.e. a trivial constant folding optimization, if
1425/// and only if the def and use are in the same BB.
1426bool PeepholeOptimizer::foldImmediate(
1427 MachineInstr &MI, SmallSet<Register, 4> &ImmDefRegs,
1428 DenseMap<Register, MachineInstr *> &ImmDefMIs, bool &Deleted) {
1429 Deleted = false;
1430 for (unsigned i = 0, e = MI.getDesc().getNumOperands(); i != e; ++i) {
1431 MachineOperand &MO = MI.getOperand(i);
1432 if (!MO.isReg() || MO.isDef())
1433 continue;
1434 Register Reg = MO.getReg();
1435 if (!Reg.isVirtual())
1436 continue;
1437 if (ImmDefRegs.count(Reg) == 0)
1438 continue;
1439 DenseMap<Register, MachineInstr *>::iterator II = ImmDefMIs.find(Reg);
1440 assert(II != ImmDefMIs.end() && "couldn't find immediate definition");
1441 if (TII->foldImmediate(MI, *II->second, Reg, MRI)) {
1442 ++NumImmFold;
1443 // foldImmediate can delete ImmDefMI if MI was its only user. If ImmDefMI
1444 // is not deleted, and we happened to get an identical MI, we can delete
1445 // MI and replace its users.
1446 if (MRI->getVRegDef(Reg) &&
1447 MI.isIdenticalTo(*II->second, MachineInstr::IgnoreVRegDefs)) {
1448 Register DstReg = MI.getOperand(0).getReg();
1449 if (DstReg.isVirtual() &&
1450 MRI->getRegClass(DstReg) == MRI->getRegClass(Reg)) {
1451 MRI->replaceRegWith(DstReg, Reg);
1452 MI.eraseFromParent();
1453 Deleted = true;
1454 }
1455 }
1456 return true;
1457 }
1458 }
1459 return false;
1460}
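The `ImmDefRegs`/`ImmDefMIs` bookkeeping used by `isMoveImmediate` and `foldImmediate` amounts to: record each move-immediate's destination register within the block, then look the constant up when a later use of that register is seen. A minimal standalone sketch — `ImmFolder` and its members are hypothetical stand-ins, not LLVM API:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

// Hypothetical sketch of the ImmDefRegs/ImmDefMIs pattern: within one basic
// block, remember which virtual registers are defined by move-immediates,
// then fold those constants into later uses.
using Reg = unsigned;

struct ImmFolder {
  std::set<Reg> ImmDefRegs;          // registers defined by a move-immediate
  std::map<Reg, int64_t> ImmDefVals; // stands in for ImmDefMIs

  void recordMoveImm(Reg R, int64_t Val) {
    ImmDefRegs.insert(R);
    ImmDefVals[R] = Val;
  }

  // Try to replace a register use with its known constant. Returns true and
  // sets Out when the register was defined by a tracked move-immediate.
  bool foldImmediate(Reg Use, int64_t &Out) const {
    if (!ImmDefRegs.count(Use))
      return false;
    Out = ImmDefVals.at(Use);
    return true;
  }
};
```

The real pass keeps the defining `MachineInstr *` instead of the raw value so the target hook `TII->foldImmediate` can inspect the instruction.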
1461
1462// FIXME: This is very simple and misses some cases which should be handled when
1463// motivating examples are found.
1464//
1465// The copy rewriting logic should look at uses as well as defs and be able to
1466// eliminate copies across blocks.
1467//
1468// Later copies that are subregister extracts will also not be eliminated since
1469// only the first copy is considered.
1470//
1471// e.g.
1472// %1 = COPY %0
1473// %2 = COPY %0:sub1
1474//
1475// Should replace %2 uses with %1:sub1
1476bool PeepholeOptimizer::foldRedundantCopy(MachineInstr &MI) {
1477 assert(MI.isCopy() && "expected a COPY machine instruction");
1478
1479 RegSubRegPair SrcPair;
1480 if (!getCopySrc(MI, SrcPair))
1481 return false;
1482
1483 Register DstReg = MI.getOperand(0).getReg();
1484 if (!DstReg.isVirtual())
1485 return false;
1486
1487 if (CopySrcMIs.insert(std::make_pair(SrcPair, &MI)).second) {
1488 // First copy of this reg seen.
1489 return false;
1490 }
1491
1492 MachineInstr *PrevCopy = CopySrcMIs.find(SrcPair)->second;
1493
1494 assert(SrcPair.SubReg == PrevCopy->getOperand(1).getSubReg() &&
1495 "Unexpected mismatching subreg!");
1496
1497 Register PrevDstReg = PrevCopy->getOperand(0).getReg();
1498
1499 // Only replace if the copy register class is the same.
1500 //
1501 // TODO: If we have multiple copies to different register classes, we may want
1502 // to track multiple copies of the same source register.
1503 if (MRI->getRegClass(DstReg) != MRI->getRegClass(PrevDstReg))
1504 return false;
1505
1506 MRI->replaceRegWith(DstReg, PrevDstReg);
1507
1508 // Lifetime of the previous copy has been extended.
1509 MRI->clearKillFlags(PrevDstReg);
1510 return true;
1511}
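`foldRedundantCopy` keys copies by their (source register, subregister) pair and reuses the first destination seen, provided the register classes match. A standalone sketch of that logic — `CopyTracker` and `Reg` are hypothetical, and register classes are modeled as plain integers:

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical model of foldRedundantCopy's bookkeeping: the first COPY of
// each (source reg, subreg) pair is remembered; a later COPY of the same
// pair can be folded by reusing the earlier destination, provided the
// register classes match.
using Reg = unsigned;
using SrcPair = std::pair<Reg, unsigned>; // (register, subregister index)

struct CopyTracker {
  std::map<SrcPair, Reg> FirstDst; // source pair -> first destination reg
  std::map<Reg, int> RegClass;     // reg -> register class id

  // Returns the register all uses of Dst can be replaced with, or 0 if the
  // copy is the first of its kind (or the classes differ).
  Reg foldRedundantCopy(SrcPair Src, Reg Dst) {
    auto [It, Inserted] = FirstDst.emplace(Src, Dst);
    if (Inserted)
      return 0; // First copy of this source seen; nothing to fold.
    Reg PrevDst = It->second;
    if (RegClass[Dst] != RegClass[PrevDst])
      return 0; // Only replace when the copy register classes match.
    return PrevDst;
  }
};
```

Like the pass, the sketch deliberately tracks only one copy per source pair; the TODO above notes the missed case of multiple copies into different classes.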
1512
1513bool PeepholeOptimizer::isNAPhysCopy(Register Reg) {
1514 return Reg.isPhysical() && !MRI->isAllocatable(Reg);
1515}
1516
1517bool PeepholeOptimizer::foldRedundantNAPhysCopy(
1518 MachineInstr &MI, DenseMap<Register, MachineInstr *> &NAPhysToVirtMIs) {
1519 assert(MI.isCopy() && "expected a COPY machine instruction");
1520
1521 if (DisableNAPhysCopyOpt)
1522 return false;
1523
1524 Register DstReg = MI.getOperand(0).getReg();
1525 Register SrcReg = MI.getOperand(1).getReg();
1526 if (isNAPhysCopy(SrcReg) && DstReg.isVirtual()) {
1527 // %vreg = COPY $physreg
1528 // Avoid using a data structure which can track multiple live non-allocatable
1529 // phys->virt copies since LLVM doesn't seem to do this.
1530 NAPhysToVirtMIs.insert({SrcReg, &MI});
1531 return false;
1532 }
1533
1534 if (!(SrcReg.isVirtual() && isNAPhysCopy(DstReg)))
1535 return false;
1536
1537 // $physreg = COPY %vreg
1538 auto PrevCopy = NAPhysToVirtMIs.find(DstReg);
1539 if (PrevCopy == NAPhysToVirtMIs.end()) {
1540 // We can't remove the copy: there was an intervening clobber of the
1541 // non-allocatable physical register after the copy to virtual.
1542 LLVM_DEBUG(dbgs() << "NAPhysCopy: intervening clobber forbids erasing "
1543 << MI);
1544 return false;
1545 }
1546
1547 Register PrevDstReg = PrevCopy->second->getOperand(0).getReg();
1548 if (PrevDstReg == SrcReg) {
1549 // Remove the virt->phys copy: we saw the virtual register definition, and
1550 // the non-allocatable physical register's state hasn't changed since then.
1551 LLVM_DEBUG(dbgs() << "NAPhysCopy: erasing " << MI);
1552 ++NumNAPhysCopies;
1553 return true;
1554 }
1555
1556 // Potential missed optimization opportunity: we saw a different virtual
1557 // register get a copy of the non-allocatable physical register, and we only
1558 // track one such copy. Avoid getting confused by this new non-allocatable
1559 // physical register definition, and remove it from the tracked copies.
1560 LLVM_DEBUG(dbgs() << "NAPhysCopy: missed opportunity " << MI);
1561 NAPhysToVirtMIs.erase(PrevCopy);
1562 return false;
1563}
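The `NAPhysToVirtMIs` map implements a small protocol: remember the last unclobbered `%vreg = COPY $physreg`, erase the entry on any clobber, and treat a later `$physreg = COPY %vreg` of the same vreg as redundant. A standalone sketch with a hypothetical `NAPhysCopyTracker` (integer ids instead of LLVM registers):

```cpp
#include <cassert>
#include <map>

// Hypothetical model of NAPhysToVirtMIs: remember the last unclobbered
// `%vreg = COPY $physreg`; a later `$physreg = COPY %vreg` of the same
// vreg restores a value the physreg already holds and can be erased.
using Reg = unsigned;

struct NAPhysCopyTracker {
  std::map<Reg, Reg> PhysToVirt; // physreg -> vreg last copied from it

  void recordPhysToVirt(Reg Phys, Reg Virt) { PhysToVirt[Phys] = Virt; }
  void clobber(Reg Phys) { PhysToVirt.erase(Phys); }

  // Returns true when `$Phys = COPY %Virt` is redundant and can be deleted.
  bool isRedundantVirtToPhys(Reg Virt, Reg Phys) {
    auto It = PhysToVirt.find(Phys);
    if (It == PhysToVirt.end())
      return false; // Intervening clobber (or never copied).
    if (It->second == Virt)
      return true;
    // A different vreg holds this physreg's value; stop tracking it.
    PhysToVirt.erase(It);
    return false;
  }
};
```

The erase in the mismatch case mirrors the "missed opportunity" path above: only one live phys->virt copy is tracked per physical register.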
1564
1565 /// \brief Returns true if \p MO is a virtual register operand.
1566 static bool isVirtualRegisterOperand(MachineOperand &MO) {
1567 return MO.isReg() && MO.getReg().isVirtual();
1568}
1569
1570bool PeepholeOptimizer::findTargetRecurrence(
1571 Register Reg, const SmallSet<Register, 2> &TargetRegs,
1572 RecurrenceCycle &RC) {
1573 // Recurrence found if Reg is in TargetRegs.
1574 if (TargetRegs.count(Reg))
1575 return true;
1576
1577 // TODO: Currently, we only allow the last instruction of the recurrence
1578 // cycle (the instruction that feeds the PHI instruction) to have more than
1579 // one use to guarantee that commuting operands does not tie registers
1580 // with overlapping live ranges. Once we have actual live range info of
1581 // each register, this constraint can be relaxed.
1582 if (!MRI->hasOneNonDBGUse(Reg))
1583 return false;
1584
1585 // Give up if the recurrence chain length is longer than the limit.
1586 if (RC.size() >= MaxRecurrenceChain)
1587 return false;
1588
1589 MachineInstr &MI = *(MRI->use_instr_nodbg_begin(Reg));
1590 unsigned Idx = MI.findRegisterUseOperandIdx(Reg, /*TRI=*/nullptr);
1591
1592 // Only interested in recurrences whose instructions have only one def, which
1593 // is a virtual register.
1594 if (MI.getDesc().getNumDefs() != 1)
1595 return false;
1596
1597 MachineOperand &DefOp = MI.getOperand(0);
1598 if (!isVirtualRegisterOperand(DefOp))
1599 return false;
1600
1601 // Check if the def operand of MI is tied to any use operand. We are only
1602 // interested in the case that all the instructions in the recurrence chain
1603 // have their def operand tied to one of the use operands.
1604 unsigned TiedUseIdx;
1605 if (!MI.isRegTiedToUseOperand(0, &TiedUseIdx))
1606 return false;
1607
1608 if (Idx == TiedUseIdx) {
1609 RC.push_back(RecurrenceInstr(&MI));
1610 return findTargetRecurrence(DefOp.getReg(), TargetRegs, RC);
1611 } else {
1612 // If Idx is not TiedUseIdx, check if Idx is commutable with TiedUseIdx.
1613 unsigned CommIdx = TargetInstrInfo::CommuteAnyOperandIndex;
1614 if (TII->findCommutedOpIndices(MI, Idx, CommIdx) && CommIdx == TiedUseIdx) {
1615 RC.push_back(RecurrenceInstr(&MI, Idx, CommIdx));
1616 return findTargetRecurrence(DefOp.getReg(), TargetRegs, RC);
1617 }
1618 }
1619
1620 return false;
1621}
1622
1623/// Phi instructions will eventually be lowered to copy instructions.
1624 /// If a phi is in a loop header, a recurrence may be formulated around the
1625 /// source and destination of the phi. In such cases, commuting operands of the
1626/// instructions in the recurrence may enable coalescing of the copy instruction
1627/// generated from the phi. For example, if there is a recurrence of
1628///
1629/// LoopHeader:
1630/// %1 = phi(%0, %100)
1631/// LoopLatch:
1632/// %0<def, tied1> = ADD %2<def, tied0>, %1
1633///
1634/// , the fact that %0 and %2 are in the same tied operands set makes
1635/// the coalescing of copy instruction generated from the phi in
1636/// LoopHeader(i.e. %1 = COPY %0) impossible, because %1 and
1637 /// %2 have overlapping live ranges. This introduces an additional move
1638 /// instruction to the final assembly. However, if we commute %2 and
1639 /// %1 of the ADD instruction, the redundant move instruction can be
1640/// avoided.
1641bool PeepholeOptimizer::optimizeRecurrence(MachineInstr &PHI) {
1642 SmallSet<Register, 2> TargetRegs;
1643 for (unsigned Idx = 1; Idx < PHI.getNumOperands(); Idx += 2) {
1644 MachineOperand &MO = PHI.getOperand(Idx);
1645 assert(isVirtualRegisterOperand(MO) && "Invalid PHI instruction");
1646 TargetRegs.insert(MO.getReg());
1647 }
1648
1649 bool Changed = false;
1650 RecurrenceCycle RC;
1651 if (findTargetRecurrence(PHI.getOperand(0).getReg(), TargetRegs, RC)) {
1652 // Commutes operands of instructions in RC if necessary so that the copy to
1653 // be generated from PHI can be coalesced.
1654 LLVM_DEBUG(dbgs() << "Optimize recurrence chain from " << PHI);
1655 for (auto &RI : RC) {
1656 LLVM_DEBUG(dbgs() << "\tInst: " << *(RI.getMI()));
1657 auto CP = RI.getCommutePair();
1658 if (CP) {
1659 Changed = true;
1660 TII->commuteInstruction(*(RI.getMI()), false, (*CP).first,
1661 (*CP).second);
1662 LLVM_DEBUG(dbgs() << "\t\tCommuted: " << *(RI.getMI()));
1663 }
1664 }
1665 }
1666
1667 return Changed;
1668}
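The commute decision in `findTargetRecurrence` reduces to: if the recurrence value arrives in the use operand already tied to the def, nothing needs to change; if it arrives in a different operand that is commutable with the tied one, swapping the two enables coalescing of the PHI's copy. A minimal sketch of just that decision — `CommuteDecision` and `commuteForRecurrence` are hypothetical names, not LLVM API:

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical model of the commute decision: an instruction's def is tied
// to one use operand; if the recurrence value arrives in a *different* use
// operand that may be swapped with the tied one, commuting lets the COPY
// generated from the PHI coalesce.
struct CommuteDecision {
  unsigned RecurrenceUseIdx; // operand index carrying the recurrence value
  unsigned TiedUseIdx;       // use operand tied to the def
  bool Commutable;           // whether the two use operands may be swapped
};

// Returns the pair of operand indices to commute, or nothing if the
// operands are already in the right place (or cannot be swapped).
std::optional<std::pair<unsigned, unsigned>>
commuteForRecurrence(const CommuteDecision &D) {
  if (D.RecurrenceUseIdx == D.TiedUseIdx)
    return std::nullopt; // Already tied to the recurrence input.
  if (!D.Commutable)
    return std::nullopt;
  return std::make_pair(D.RecurrenceUseIdx, D.TiedUseIdx);
}
```

In the pass, the commutability check is the target hook `TII->findCommutedOpIndices`, and the recorded pair is replayed by `optimizeRecurrence` via `commuteInstruction`.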
1669
1670PreservedAnalyses
1671 PeepholeOptimizerPass::run(MachineFunction &MF,
1672 MachineFunctionAnalysisManager &MFAM) {
1673 MFPropsModifier _(*this, MF);
1674 auto *DT =
1675 Aggressive ? &MFAM.getResult<MachineDominatorTreeAnalysis>(MF) : nullptr;
1676 auto *MLI = &MFAM.getResult<MachineLoopAnalysis>(MF);
1677 PeepholeOptimizer Impl(DT, MLI);
1678 bool Changed = Impl.run(MF);
1679 if (!Changed)
1680 return PreservedAnalyses::all();
1681
1682 auto PA = getMachineFunctionPassPreservedAnalyses();
1683 PA.preserve<MachineDominatorTreeAnalysis>();
1684 PA.preserve<MachineLoopAnalysis>();
1685 PA.preserveSet<CFGAnalyses>();
1686 return PA;
1687}
1688
1689bool PeepholeOptimizerLegacy::runOnMachineFunction(MachineFunction &MF) {
1690 if (skipFunction(MF.getFunction()))
1691 return false;
1692 auto *DT = Aggressive
1693 ? &getAnalysis<MachineDominatorTreeWrapperPass>().getDomTree()
1694 : nullptr;
1695 auto *MLI = &getAnalysis<MachineLoopInfoWrapperPass>().getLI();
1696 PeepholeOptimizer Impl(DT, MLI);
1697 return Impl.run(MF);
1698}
1699
1700bool PeepholeOptimizer::run(MachineFunction &MF) {
1701
1702 LLVM_DEBUG(dbgs() << "********** PEEPHOLE OPTIMIZER **********\n");
1703 LLVM_DEBUG(dbgs() << "********** Function: " << MF.getName() << '\n');
1704
1705 if (DisablePeephole)
1706 return false;
1707
1708 TII = MF.getSubtarget().getInstrInfo();
1709 TRI = MF.getSubtarget().getRegisterInfo();
1710 MRI = &MF.getRegInfo();
1711 MF.setDelegate(this);
1712
1713 bool Changed = false;
1714
1715 for (MachineBasicBlock &MBB : MF) {
1716 bool SeenMoveImm = false;
1717
1718 // During this forward scan, at some point it needs to answer the question
1719 // "given a pointer to an MI in the current BB, is it located before or
1720 // after the current instruction?".
1721 // To answer this, the following set keeps track of the MIs already seen
1722 // during the scan; if an MI is not in the set, it is assumed to be located
1723 // after. Newly created MIs have to be inserted in the set as well.
1724 SmallPtrSet<MachineInstr *, 16> LocalMIs;
1725 SmallSet<Register, 4> ImmDefRegs;
1726 DenseMap<Register, MachineInstr *> ImmDefMIs;
1727 SmallSet<Register, 16> FoldAsLoadDefCandidates;
1728
1729 // Track when a non-allocatable physical register is copied to a virtual
1730 // register so that useless moves can be removed.
1731 //
1732 // $physreg is the map index; MI is the last valid `%vreg = COPY $physreg`
1733 // without any intervening re-definition of $physreg.
1734 DenseMap<Register, MachineInstr *> NAPhysToVirtMIs;
1735
1736 CopySrcMIs.clear();
1737
1738 bool IsLoopHeader = MLI->isLoopHeader(&MBB);
1739
1740 for (MachineBasicBlock::iterator MII = MBB.begin(), MIE = MBB.end();
1741 MII != MIE;) {
1742 MachineInstr *MI = &*MII;
1743 // We may be erasing MI below, increment MII now.
1744 ++MII;
1745 LocalMIs.insert(MI);
1746
1747 // Skip debug instructions. They should not affect this peephole
1748 // optimization.
1749 if (MI->isDebugInstr())
1750 continue;
1751
1752 if (MI->isPosition())
1753 continue;
1754
1755 if (IsLoopHeader && MI->isPHI()) {
1756 if (optimizeRecurrence(*MI)) {
1757 Changed = true;
1758 continue;
1759 }
1760 }
1761
1762 if (!MI->isCopy()) {
1763 for (const MachineOperand &MO : MI->operands()) {
1764 // Visit all operands: definitions can be implicit or explicit.
1765 if (MO.isReg()) {
1766 Register Reg = MO.getReg();
1767 if (MO.isDef() && isNAPhysCopy(Reg)) {
1768 const auto &Def = NAPhysToVirtMIs.find(Reg);
1769 if (Def != NAPhysToVirtMIs.end()) {
1770 // A new definition of the non-allocatable physical register
1771 // invalidates previous copies.
1772 LLVM_DEBUG(dbgs()
1773 << "NAPhysCopy: invalidating because of " << *MI);
1774 NAPhysToVirtMIs.erase(Def);
1775 }
1776 }
1777 } else if (MO.isRegMask()) {
1778 const uint32_t *RegMask = MO.getRegMask();
1779 for (auto &RegMI : NAPhysToVirtMIs) {
1780 Register Def = RegMI.first;
1781 if (MachineOperand::clobbersPhysReg(RegMask, Def)) {
1782 LLVM_DEBUG(dbgs()
1783 << "NAPhysCopy: invalidating because of " << *MI);
1784 NAPhysToVirtMIs.erase(Def);
1785 }
1786 }
1787 }
1788 }
1789 }
1790
1791 if (MI->isImplicitDef() || MI->isKill())
1792 continue;
1793
1794 if (MI->isInlineAsm() || MI->hasUnmodeledSideEffects()) {
1795 // Blow away all non-allocatable physical registers knowledge since we
1796 // don't know what's correct anymore.
1797 //
1798 // FIXME: handle explicit asm clobbers.
1799 LLVM_DEBUG(dbgs() << "NAPhysCopy: blowing away all info due to "
1800 << *MI);
1801 NAPhysToVirtMIs.clear();
1802 }
1803
1804 if ((isUncoalescableCopy(*MI) &&
1805 optimizeUncoalescableCopy(*MI, LocalMIs)) ||
1806 (MI->isCompare() && optimizeCmpInstr(*MI)) ||
1807 (MI->isSelect() && optimizeSelect(*MI, LocalMIs))) {
1808 // MI is deleted.
1809 LocalMIs.erase(MI);
1810 Changed = true;
1811 continue;
1812 }
1813
1814 if (MI->isConditionalBranch() && optimizeCondBranch(*MI)) {
1815 Changed = true;
1816 continue;
1817 }
1818
1819 if (isCoalescableCopy(*MI) && optimizeCoalescableCopy(*MI)) {
1820 // MI is just rewritten.
1821 Changed = true;
1822 continue;
1823 }
1824
1825 if (MI->isCopy() && (foldRedundantCopy(*MI) ||
1826 foldRedundantNAPhysCopy(*MI, NAPhysToVirtMIs))) {
1827 LocalMIs.erase(MI);
1828 LLVM_DEBUG(dbgs() << "Deleting redundant copy: " << *MI << "\n");
1829 MI->eraseFromParent();
1830 Changed = true;
1831 continue;
1832 }
1833
1834 if (isMoveImmediate(*MI, ImmDefRegs, ImmDefMIs)) {
1835 SeenMoveImm = true;
1836 } else {
1837 Changed |= optimizeExtInstr(*MI, MBB, LocalMIs);
1838 // optimizeExtInstr might have created new instructions after MI
1839 // and before the already incremented MII. Adjust MII so that the
1840 // next iteration sees the new instructions.
1841 MII = MI;
1842 ++MII;
1843 if (SeenMoveImm) {
1844 bool Deleted;
1845 Changed |= foldImmediate(*MI, ImmDefRegs, ImmDefMIs, Deleted);
1846 if (Deleted) {
1847 LocalMIs.erase(MI);
1848 continue;
1849 }
1850 }
1851 }
1852
1853 // Check whether MI is a load candidate for folding into a later
1854 // instruction. If MI is not a candidate, check whether we can fold an
1855 // earlier load into MI.
1856 if (!isLoadFoldable(*MI, FoldAsLoadDefCandidates) &&
1857 !FoldAsLoadDefCandidates.empty()) {
1858
1859 // We visit each operand even after successfully folding a previous
1860 // one. This allows us to fold multiple loads into a single
1861 // instruction. We do assume that optimizeLoadInstr doesn't insert
1862 // foldable uses earlier in the argument list. Since we don't restart
1863 // iteration, we'd miss such cases.
1864 const MCInstrDesc &MIDesc = MI->getDesc();
1865 for (unsigned i = MIDesc.getNumDefs(); i != MI->getNumOperands(); ++i) {
1866 const MachineOperand &MOp = MI->getOperand(i);
1867 if (!MOp.isReg())
1868 continue;
1869 Register FoldAsLoadDefReg = MOp.getReg();
1870 if (FoldAsLoadDefCandidates.count(FoldAsLoadDefReg)) {
1871 // We need to fold load after optimizeCmpInstr, since
1872 // optimizeCmpInstr can enable folding by converting SUB to CMP.
1873 // Save FoldAsLoadDefReg because optimizeLoadInstr() resets it and
1874 // we need it for markUsesInDebugValueAsUndef().
1875 Register FoldedReg = FoldAsLoadDefReg;
1876 MachineInstr *DefMI = nullptr;
1877 if (MachineInstr *FoldMI =
1878 TII->optimizeLoadInstr(*MI, MRI, FoldAsLoadDefReg, DefMI)) {
1879 // Update LocalMIs since we replaced MI with FoldMI and deleted
1880 // DefMI.
1881 LLVM_DEBUG(dbgs() << "Replacing: " << *MI);
1882 LLVM_DEBUG(dbgs() << " With: " << *FoldMI);
1883 LocalMIs.erase(MI);
1884 LocalMIs.erase(DefMI);
1885 LocalMIs.insert(FoldMI);
1886 // Update the call info.
1887 if (MI->shouldUpdateAdditionalCallInfo())
1888 MI->getMF()->moveAdditionalCallInfo(MI, FoldMI);
1889 MI->eraseFromParent();
1890 DefMI->eraseFromParent();
1891 MRI->markUsesInDebugValueAsUndef(FoldedReg);
1892 FoldAsLoadDefCandidates.erase(FoldedReg);
1893 ++NumLoadFold;
1894
1895 // MI is replaced with FoldMI so we can continue trying to fold
1896 Changed = true;
1897 MI = FoldMI;
1898 }
1899 }
1900 }
1901 }
1902
1903 // If we run into an instruction we can't fold across, discard
1904 // the load candidates. Note: We might be able to fold *into* this
1905 // instruction, so this needs to be after the folding logic.
1906 if (MI->isLoadFoldBarrier()) {
1907 LLVM_DEBUG(dbgs() << "Encountered load fold barrier on " << *MI);
1908 FoldAsLoadDefCandidates.clear();
1909 }
1910 }
1911 }
1912
1913 MF.resetDelegate(this);
1914 return Changed;
1915}
1916
1917ValueTrackerResult ValueTracker::getNextSourceFromCopy() {
1918 assert(Def->isCopy() && "Invalid definition");
1919 // Copy instructions are supposed to be: Def = Src.
1920 // If someone breaks this assumption, bad things will happen everywhere.
1921 // There may be implicit uses preventing the copy from being moved across
1922 // some target-specific register definitions.
1923 assert(Def->getNumOperands() - Def->getNumImplicitOperands() == 2 &&
1924 "Invalid number of operands");
1925 assert(!Def->hasImplicitDef() && "Only implicit uses are allowed");
1926 assert(!Def->getOperand(DefIdx).getSubReg() && "no subregister defs in SSA");
1927
1928 // Otherwise, we want the whole source.
1929 const MachineOperand &Src = Def->getOperand(1);
1930 if (Src.isUndef())
1931 return ValueTrackerResult();
1932 return ValueTrackerResult(Src.getReg(), Src.getSubReg());
1933}
1934
1935ValueTrackerResult ValueTracker::getNextSourceFromBitcast() {
1936 assert(Def->isBitcast() && "Invalid definition");
1937
1938 // Bail if there are effects that a plain copy will not expose.
1939 if (Def->mayRaiseFPException() || Def->hasUnmodeledSideEffects())
1940 return ValueTrackerResult();
1941
1942 // Bitcasts with more than one def are not supported.
1943 if (Def->getDesc().getNumDefs() != 1)
1944 return ValueTrackerResult();
1945
1946 assert(!Def->getOperand(DefIdx).getSubReg() && "no subregister defs in SSA");
1947
1948 unsigned SrcIdx = Def->getNumOperands();
1949 for (unsigned OpIdx = DefIdx + 1, EndOpIdx = SrcIdx; OpIdx != EndOpIdx;
1950 ++OpIdx) {
1951 const MachineOperand &MO = Def->getOperand(OpIdx);
1952 if (!MO.isReg() || !MO.getReg())
1953 continue;
1954 // Ignore dead implicit defs.
1955 if (MO.isImplicit() && MO.isDead())
1956 continue;
1957 assert(!MO.isDef() && "We should have skipped all the definitions by now");
1958 if (SrcIdx != EndOpIdx)
1959 // Multiple sources?
1960 return ValueTrackerResult();
1961 SrcIdx = OpIdx;
1962 }
1963
1964 // In some rare cases, Def has no input; SrcIdx is then out of bounds and
1965 // getOperand(SrcIdx) below would fail.
1966 if (SrcIdx >= Def->getNumOperands())
1967 return ValueTrackerResult();
1968
1969 const MachineOperand &DefOp = Def->getOperand(DefIdx);
1970
1971 // Stop when any user of the bitcast is a SUBREG_TO_REG, replacing with a COPY
1972 // will break the assumed guarantees for the upper bits.
1973 for (const MachineInstr &UseMI : MRI.use_nodbg_instructions(DefOp.getReg())) {
1974 if (UseMI.isSubregToReg())
1975 return ValueTrackerResult();
1976 }
1977
1978 const MachineOperand &Src = Def->getOperand(SrcIdx);
1979 if (Src.isUndef())
1980 return ValueTrackerResult();
1981 return ValueTrackerResult(Src.getReg(), Src.getSubReg());
1982}
1983
1984ValueTrackerResult ValueTracker::getNextSourceFromRegSequence() {
1985 assert((Def->isRegSequence() || Def->isRegSequenceLike()) &&
1986 "Invalid definition");
1987
1988 assert(!Def->getOperand(DefIdx).getSubReg() && "illegal subregister def");
1989
1991 if (!TII->getRegSequenceInputs(*Def, DefIdx, RegSeqInputRegs))
1992 return ValueTrackerResult();
1993
1994 // We are looking at:
1995 // Def = REG_SEQUENCE v0, sub0, v1, sub1, ...
1996 //
1997 // Check if one of the operands exactly defines the subreg we are interested
1998 // in.
1999 for (const RegSubRegPairAndIdx &RegSeqInput : RegSeqInputRegs) {
2000 if (RegSeqInput.SubIdx == DefSubReg)
2001 return ValueTrackerResult(RegSeqInput.Reg, RegSeqInput.SubReg);
2002 }
2003
2004 const TargetRegisterInfo *TRI = MRI.getTargetRegisterInfo();
2005
2006 // If we did not find an exact match, see if we can do a composition to
2007 // extract a sub-subregister.
2008 for (const RegSubRegPairAndIdx &RegSeqInput : RegSeqInputRegs) {
2009 LaneBitmask DefMask = TRI->getSubRegIndexLaneMask(DefSubReg);
2010 LaneBitmask ThisOpRegMask = TRI->getSubRegIndexLaneMask(RegSeqInput.SubIdx);
2011
2012 // Check that this extract reads a subset of this single reg_sequence input.
2013 //
2014 // FIXME: We should be able to filter this in terms of the indexes directly
2015 // without checking the lanemasks.
2016 if ((DefMask & ThisOpRegMask) != DefMask)
2017 continue;
2018
2019 unsigned ReverseDefCompose =
2020 TRI->reverseComposeSubRegIndices(RegSeqInput.SubIdx, DefSubReg);
2021 if (!ReverseDefCompose)
2022 continue;
2023
2024 unsigned ComposedDefInSrcReg1 =
2025 TRI->composeSubRegIndices(RegSeqInput.SubReg, ReverseDefCompose);
2026
2027 // TODO: We should be able to defer checking if the result register class
2028 // supports the index to continue looking for a rewritable source.
2029 //
2030 // TODO: Should we modify the register class to support the index?
2031 const TargetRegisterClass *SrcRC = MRI.getRegClass(RegSeqInput.Reg);
2032 const TargetRegisterClass *SrcWithSubRC =
2033 TRI->getSubClassWithSubReg(SrcRC, ComposedDefInSrcReg1);
2034 if (SrcRC != SrcWithSubRC)
2035 return ValueTrackerResult();
2036
2037 return ValueTrackerResult(RegSeqInput.Reg, ComposedDefInSrcReg1);
2038 }
2039
2040 // If the subreg we are tracking is super-defined by another subreg,
2041 // we could follow this value. However, this would require composing
2042 // the subregs, and we do not do that for now.
2043 return ValueTrackerResult();
2044}
2045
2046ValueTrackerResult ValueTracker::getNextSourceFromInsertSubreg() {
2047 assert((Def->isInsertSubreg() || Def->isInsertSubregLike()) &&
2048 "Invalid definition");
2049 assert(!Def->getOperand(DefIdx).getSubReg() && "no subreg defs in SSA");
2050
2052 RegSubRegPairAndIdx InsertedReg;
2053 if (!TII->getInsertSubregInputs(*Def, DefIdx, BaseReg, InsertedReg))
2054 return ValueTrackerResult();
2055
2056 // We are looking at:
2057 // Def = INSERT_SUBREG v0, v1, sub1
2058 // There are two cases:
2059 // 1. DefSubReg == sub1, get v1.
2060 // 2. DefSubReg != sub1, the value may be available through v0.
2061
2062 // #1 Check if the inserted register matches the required sub index.
2063 if (InsertedReg.SubIdx == DefSubReg) {
2064 return ValueTrackerResult(InsertedReg.Reg, InsertedReg.SubReg);
2065 }
2066 // #2 Otherwise, if the subregister we are looking for is not partially
2067 // defined by the inserted element, we can look through the main
2068 // register (v0).
2069 const MachineOperand &MODef = Def->getOperand(DefIdx);
2070 // If the result register (Def) and the base register (v0) do not
2071 // have the same register class or if we have to compose
2072 // subregisters, bail out.
2073 if (MRI.getRegClass(MODef.getReg()) != MRI.getRegClass(BaseReg.Reg) ||
2074 BaseReg.SubReg)
2075 return ValueTrackerResult();
2076
2077 // Get the TRI and check if the inserted sub-register overlaps with the
2078 // sub-register we are tracking.
2079 const TargetRegisterInfo *TRI = MRI.getTargetRegisterInfo();
2080 if ((TRI->getSubRegIndexLaneMask(DefSubReg) &
2081 TRI->getSubRegIndexLaneMask(InsertedReg.SubIdx))
2082 .any())
2083 return ValueTrackerResult();
2084 // At this point, the value is available in v0 via the same subreg
2085 // we used for Def.
2086 return ValueTrackerResult(BaseReg.Reg, DefSubReg);
2087}
2088
2089ValueTrackerResult ValueTracker::getNextSourceFromExtractSubreg() {
2090 assert((Def->isExtractSubreg() || Def->isExtractSubregLike()) &&
2091 "Invalid definition");
2092 // We are looking at:
2093 // Def = EXTRACT_SUBREG v0, sub0
2094
2095 // Bail if we have to compose sub registers.
2096 // Indeed, if DefSubReg != 0, we would have to compose it with sub0.
2097 if (DefSubReg)
2098 return ValueTrackerResult();
2099
2100 RegSubRegPairAndIdx ExtractSubregInputReg;
2101 if (!TII->getExtractSubregInputs(*Def, DefIdx, ExtractSubregInputReg))
2102 return ValueTrackerResult();
2103
2104 // Bail if we have to compose sub registers.
2105 // Likewise, if v0.subreg != 0, we would have to compose v0.subreg with sub0.
2106 if (ExtractSubregInputReg.SubReg)
2107 return ValueTrackerResult();
2108 // Otherwise, the value is available in v0.sub0.
2109 return ValueTrackerResult(ExtractSubregInputReg.Reg,
2110 ExtractSubregInputReg.SubIdx);
2111}
2112
2113ValueTrackerResult ValueTracker::getNextSourceFromSubregToReg() {
2114 assert(Def->isSubregToReg() && "Invalid definition");
2115 // We are looking at:
2116 // Def = SUBREG_TO_REG Imm, v0, sub0
2117
2118 // Bail if we have to compose sub registers.
2119 // If DefSubReg != sub0, we would have to check that all the bits
2120 // we track are included in sub0 and if yes, we would have to
2121 // determine the right subreg in v0.
2122 if (DefSubReg != Def->getOperand(3).getImm())
2123 return ValueTrackerResult();
2124 // Bail if we have to compose sub registers.
2125 // Likewise, if v0.subreg != 0, we would have to compose it with sub0.
2126 if (Def->getOperand(2).getSubReg())
2127 return ValueTrackerResult();
2128
2129 return ValueTrackerResult(Def->getOperand(2).getReg(),
2130 Def->getOperand(3).getImm());
2131}
2132
2133/// Explore each PHI incoming operand and return its sources.
2134ValueTrackerResult ValueTracker::getNextSourceFromPHI() {
2135 assert(Def->isPHI() && "Invalid definition");
2136 ValueTrackerResult Res;
2137
2138 // Return all register sources for PHI instructions.
2139 for (unsigned i = 1, e = Def->getNumOperands(); i < e; i += 2) {
2140 const MachineOperand &MO = Def->getOperand(i);
2141 assert(MO.isReg() && "Invalid PHI instruction");
2142 // We have no code to deal with undef operands. They shouldn't happen in
2143 // normal programs anyway.
2144 if (MO.isUndef())
2145 return ValueTrackerResult();
2146 Res.addSource(MO.getReg(), MO.getSubReg());
2147 }
2148
2149 return Res;
2150}
2151
2152ValueTrackerResult ValueTracker::getNextSourceImpl() {
2153 assert(Def && "This method needs a valid definition");
2154
2155 assert(((Def->getOperand(DefIdx).isDef() &&
2156 (DefIdx < Def->getDesc().getNumDefs() ||
2157 Def->getDesc().isVariadic())) ||
2158 Def->getOperand(DefIdx).isImplicit()) &&
2159 "Invalid DefIdx");
2160 if (Def->isCopy())
2161 return getNextSourceFromCopy();
2162 if (Def->isBitcast())
2163 return getNextSourceFromBitcast();
2164 // All the remaining cases involve "complex" instructions.
2165 // Bail if we did not ask for the advanced tracking.
2166 if (!UseAdvancedTracking)
2167 return ValueTrackerResult();
2168 if (Def->isRegSequence() || Def->isRegSequenceLike())
2169 return getNextSourceFromRegSequence();
2170 if (Def->isInsertSubreg() || Def->isInsertSubregLike())
2171 return getNextSourceFromInsertSubreg();
2172 if (Def->isExtractSubreg() || Def->isExtractSubregLike())
2173 return getNextSourceFromExtractSubreg();
2174 if (Def->isSubregToReg())
2175 return getNextSourceFromSubregToReg();
2176 if (Def->isPHI())
2177 return getNextSourceFromPHI();
2178 return ValueTrackerResult();
2179}
2180
2181ValueTrackerResult ValueTracker::getNextSource() {
2182 // If we reach a point where we cannot move up in the use-def chain,
2183 // there is nothing we can get.
2184 if (!Def)
2185 return ValueTrackerResult();
2186
2187 ValueTrackerResult Res = getNextSourceImpl();
2188 if (Res.isValid()) {
2189 // Update definition, definition index, and subregister for the
2190 // next call of getNextSource.
2191 // Update the current register.
2192 bool OneRegSrc = Res.getNumSources() == 1;
2193 if (OneRegSrc)
2194 Reg = Res.getSrcReg(0);
2195 // Update the result before moving up in the use-def chain
2196 // with the instruction containing the last found sources.
2197 Res.setInst(Def);
2198
2199 // If we can still move up in the use-def chain, move to the next
2200 // definition.
2201 if (!Reg.isPhysical() && OneRegSrc) {
2202 MachineRegisterInfo::def_iterator DI = MRI.def_begin(Reg);
2203 if (DI != MRI.def_end()) {
2204 Def = DI->getParent();
2205 DefIdx = DI.getOperandNo();
2206 DefSubReg = Res.getSrcSubReg(0);
2207 } else {
2208 Def = nullptr;
2209 }
2210 return Res;
2211 }
2212 }
2213 // If we end up here, this means we will not be able to find another source
2214 // for the next iteration. Make sure any new call to getNextSource bails out
2215 // early by cutting the use-def chain.
2216 Def = nullptr;
2217 return Res;
2218}