Merged
31 changes: 19 additions & 12 deletions src/coreclr/jit/lower.cpp
@@ -1150,10 +1150,6 @@ GenTree* Lowering::LowerSwitch(GenTree* node)
 bool Lowering::TryLowerSwitchToBitTest(
     BasicBlock* jumpTable[], unsigned jumpCount, unsigned targetCount, BasicBlock* bbSwitch, GenTree* switchValue)
 {
-#ifndef TARGET_XARCH
-    // Other architectures may use this if they substitute GT_BT with equivalent code.
-    return false;
-#else
     assert(jumpCount >= 2);
     assert(targetCount >= 2);
     assert(bbSwitch->bbJumpKind == BBJ_SWITCH);
@@ -1223,7 +1219,7 @@ bool Lowering::TryLowerSwitchToBitTest(
         return false;
     }

-#ifdef TARGET_64BIT
+#if defined(TARGET_64BIT) && defined(TARGET_XARCH)
     //
     // See if we can avoid an 8 byte immediate on 64 bit targets. If all upper 32 bits are 1
     // then inverting the bit table will make them 0 so that the table now fits in 32 bits.
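
To make the comment above concrete, here is a minimal sketch of the inversion trick with a hypothetical bit table value (the constant and names are illustrative, not from the PR):

#include <cassert>
#include <cstdint>

int main()
{
    // Hypothetical 64-bit bit table whose upper 32 bits are all 1:
    // bits 0 and 2 are clear, everything else is set.
    uint64_t bitTable = 0xFFFFFFFFFFFFFFFAull;
    // Inverting the table clears the upper half, so the constant now fits
    // in a 32-bit immediate; the lowering swaps which successor block the
    // set bits select (bbCase0 vs bbCase1) to preserve behavior.
    uint64_t inverted = ~bitTable; // 0x0000000000000005
    assert(inverted <= UINT32_MAX);
    return 0;
}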
@@ -1270,20 +1266,31 @@ bool Lowering::TryLowerSwitchToBitTest(
         comp->fgAddRefPred(bbCase1, bbSwitch);
     }

+    var_types bitTableType = (bitCount <= (genTypeSize(TYP_INT) * 8)) ? TYP_INT : TYP_LONG;
+    GenTree*  bitTableIcon = comp->gtNewIconNode(bitTable, bitTableType);
+
+#ifdef TARGET_XARCH
     //
     // Append BT(bitTable, switchValue) and JCC(condition) to the switch block.
     //
-    var_types bitTableType = (bitCount <= (genTypeSize(TYP_INT) * 8)) ? TYP_INT : TYP_LONG;
-    GenTree*  bitTableIcon = comp->gtNewIconNode(bitTable, bitTableType);
-    GenTree*  bitTest      = comp->gtNewOperNode(GT_BT, TYP_VOID, bitTableIcon, switchValue);
+    GenTree* bitTest = comp->gtNewOperNode(GT_BT, TYP_VOID, bitTableIcon, switchValue);
     bitTest->gtFlags |= GTF_SET_FLAGS;
     GenTreeCC* jcc = comp->gtNewCC(GT_JCC, TYP_VOID, bbSwitchCondition);

     LIR::AsRange(bbSwitch).InsertAfter(switchValue, bitTableIcon, bitTest, jcc);

+#else // TARGET_XARCH
+    //
+    // Fallback to AND(RSZ(bitTable, switchValue), 1)
+    //
+    GenTree* tstCns = comp->gtNewIconNode(bbSwitch->bbNext != bbCase0 ? 0 : 1, bitTableType);
+    GenTree* shift  = comp->gtNewOperNode(GT_RSZ, bitTableType, bitTableIcon, switchValue);
+    GenTree* one    = comp->gtNewIconNode(1, bitTableType);
+    GenTree* andOp  = comp->gtNewOperNode(GT_AND, bitTableType, shift, one);
+    GenTree* cmp    = comp->gtNewOperNode(GT_EQ, TYP_INT, andOp, tstCns);
+    GenTree* jcc    = comp->gtNewOperNode(GT_JTRUE, TYP_VOID, cmp);
+    LIR::AsRange(bbSwitch).InsertAfter(switchValue, bitTableIcon, shift, tstCns, one);
+    LIR::AsRange(bbSwitch).InsertAfter(one, andOp, cmp, jcc);
+#endif // !TARGET_XARCH
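
In plain C++ terms, the non-xarch fallback added above extracts one bit of the table roughly like this (a sketch; BitSelectsCase1 is an illustrative helper, not JIT code):

#include <cstdint>

// A minimal sketch of the emitted test: shift the bit for this case into
// position 0 and mask it (AND(RSZ(bitTable, switchValue), 1)). The JIT then
// compares the result against 0 or 1, chosen so that the branch polarity
// matches whichever successor is the fall-through block.
bool BitSelectsCase1(uint64_t bitTable, uint32_t switchValue)
{
    return ((bitTable >> switchValue) & 1) != 0;
}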
Comment on lines +1272 to +1292
jakobbotsch (Member):
Why do we still need the xarch special case? Doesn't OptimizeConstCompare handle this case? Ideally we would teach it about the missing opportunity so that everyone benefits, instead of special-casing it here.

EgorBo (Member Author), Sep 9, 2023:
I thought about it, but I'd like to leave it for a future follow-up since it involves more work to add that peephole, and I was mainly interested in improving arm64.

EgorBo (Member Author):
Ah, you mean that the peephole already exists?

jakobbotsch (Member):
Ok, sounds fine to me. I would personally expect x & (1 << y) to be quite common, and it looks like we are missing this opportunity for arm64 (TEST_EQ/TEST_NE(x, LSH(1, y)) => TEST_EQ/TEST_NE(RSZ(x, y), 1)). Then this transform could produce EQ/NE(AND(x, LSH(1, y)))

jakobbotsch (Member):
I don't think it exists, but it seems like adding it here would be straightforward:

#ifdef TARGET_XARCH
    if (cmp->OperIs(GT_TEST_EQ, GT_TEST_NE))
    {
        //
        // Transform TEST_EQ|NE(x, LSH(1, y)) into BT(x, y) when possible. Using BT
        // results in smaller and faster code. It also doesn't have special register
        // requirements, unlike LSH that requires the shift count to be in ECX.
        // Note that BT has the same behavior as LSH when the bit index exceeds the
        // operand bit size - it uses (bit_index MOD bit_size).
        //
        GenTree* lsh = cmp->gtGetOp2();
        if (lsh->OperIs(GT_LSH) && varTypeIsIntOrI(lsh->TypeGet()) && lsh->gtGetOp1()->IsIntegralConst(1))
        {
            cmp->SetOper(cmp->OperIs(GT_TEST_EQ) ? GT_BITTEST_EQ : GT_BITTEST_NE);
            cmp->AsOp()->gtOp2 = lsh->gtGetOp2();
            cmp->gtGetOp2()->ClearContained();
            BlockRange().Remove(lsh->gtGetOp1());
            BlockRange().Remove(lsh);
            return cmp->gtNext;
        }
    }
#endif // TARGET_XARCH
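
For context, a hypothetical source-level pattern this peephole targets: a bit test written as x & (1 << y), which reaches lowering as TEST_EQ/TEST_NE(x, LSH(1, y)) (illustrative C++, not from the PR):

#include <cstdint>

// On xarch, the proposed transform would emit BT(x, y) for this test
// instead of materializing 1 << y in a register (with y pinned to ECX).
bool IsBitSet(uint32_t x, uint32_t y)
{
    return (x & (1u << y)) != 0;
}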

jakobbotsch (Member):
Of course we could also teach it about TEST_EQ/TEST_NE(RSZ(x, y), 1) => BT(x, y) on x64/x86, but I guess this is a less common pattern

jakobbotsch (Member):
Anyway I'm ok with keeping this PR as is.

EgorBo (Member Author):
@jakobbotsch ok, let's keep it as is then; I'll file a "good-first-issue" to recognize the BT pattern (or will work on it myself when I have time).

     return true;
-#endif // TARGET_XARCH
 }

void Lowering::ReplaceArgWithPutArgOrBitcast(GenTree** argSlot, GenTree* putArgOrBitcast)
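
Taken together, TryLowerSwitchToBitTest replaces a dense two-target switch with a single probe into a bit-table constant. A rough before/after sketch in C++ (values and bounds are illustrative, not from the PR):

#include <cstdint>

// Before: a dense switch with two distinct targets.
int Before(uint32_t v)
{
    switch (v)
    {
        case 0:
        case 2:
        case 5:
            return 1; // the "bbCase1" successor
        default:
            return 0; // the "bbCase0" successor
    }
}

// After (conceptually): bits 0, 2 and 5 of the table select the same target,
// so one shift-and-mask replaces the jump table.
int After(uint32_t v)
{
    const uint32_t bitTable = (1u << 0) | (1u << 2) | (1u << 5); // 0x25
    return (v <= 5) && (((bitTable >> v) & 1) != 0) ? 1 : 0;
}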