Merged
44 changes: 36 additions & 8 deletions llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -3047,13 +3047,6 @@ InstructionCost VPReplicateRecipe::computeCost(ElementCount VF,
     // instruction cost.
     return 0;
   case Instruction::Call: {
-    if (!isSingleScalar()) {
-      // TODO: Handle remaining call costs here as well.
-      if (VF.isScalable())
-        return InstructionCost::getInvalid();
-      break;
-    }
-
     auto *CalledFn =
         cast<Function>(getOperand(getNumOperands() - 1)->getLiveInIRValue());
     if (CalledFn->isIntrinsic())
@@ -3063,7 +3056,42 @@ InstructionCost VPReplicateRecipe::computeCost(ElementCount VF,
     for (VPValue *ArgOp : drop_end(operands()))
       Tys.push_back(Ctx.Types.inferScalarType(ArgOp));
     Type *ResultTy = Ctx.Types.inferScalarType(this);
-    return Ctx.TTI.getCallInstrCost(CalledFn, ResultTy, Tys, Ctx.CostKind);
+    InstructionCost ScalarCallCost =
+        Ctx.TTI.getCallInstrCost(CalledFn, ResultTy, Tys, Ctx.CostKind);
Contributor

I wonder if the old code was even correct? It looks like in the legacy cost model (LoopVectorizationCostModel::getVectorCallCost) we selected the min of getVectorIntrinsicCost and TTI.getCallInstrCost if we're invoking an intrinsic.

Contributor Author

The current code does not handle intrinsics yet (see the bail-out at line 3008).

The twist with intrinsics is that if we chose the intrinsic cost, we only create VPReplicateRecipe for various pseudo-intrinsics. I'll add support for intrinsics as a follow-up.

Contributor

Ah yes, sorry I missed that! OK great and thanks for doing this.

+    if (isSingleScalar())
+      return ScalarCallCost;
+
+    if (VF.isScalable())
+      return InstructionCost::getInvalid();
+
+    // Compute the cost of scalarizing the result and operands if needed.
+    InstructionCost ScalarizationCost = 0;
+    if (VF.isVector()) {
+      if (!ResultTy->isVoidTy()) {
+        for (Type *VectorTy :
+             to_vector(getContainedTypes(toVectorizedTy(ResultTy, VF)))) {
+          ScalarizationCost += Ctx.TTI.getScalarizationOverhead(
+              cast<VectorType>(VectorTy), APInt::getAllOnes(VF.getFixedValue()),
+              /*Insert=*/true,
+              /*Extract=*/false, Ctx.CostKind);
+        }
+      }
+      // Skip operands that do not require extraction/scalarization and do not
+      // incur any overhead.
+      SmallPtrSet<const VPValue *, 4> UniqueOperands;
+      Tys.clear();
+      for (auto *Op : drop_end(operands())) {
+        if (Op->isLiveIn() || isa<VPReplicateRecipe, VPPredInstPHIRecipe>(Op) ||
+            !UniqueOperands.insert(Op).second)
+          continue;
+        Tys.push_back(toVectorizedTy(Ctx.Types.inferScalarType(Op), VF));
+      }
+      ScalarizationCost +=
+          Ctx.TTI.getOperandsScalarizationOverhead(Tys, Ctx.CostKind);
+    }
+
+    return ScalarCallCost * (isSingleScalar() ? 1 : VF.getFixedValue()) +
+           ScalarizationCost;
Comment on lines +3093 to +3094
Collaborator

I could be mis-reading the indentation here, but if it was single scalar, wouldn't this case have exited early at line 2992?

Contributor Author

yep, removed, thanks

   }
   case Instruction::Add:
   case Instruction::Sub:
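
For readers following the arithmetic in the new `Instruction::Call` case, here is a minimal standalone sketch of the same shape of computation: the scalar call cost multiplied by the fixed VF, plus an insert overhead for a non-void result and an extract overhead for each distinct operand that actually needs scalarizing. It does not use LLVM. `ToyOperand`, `toyReplicatedCallCost`, and the per-lane insert/extract costs of 1 are assumptions made purely for illustration; in the patch these overheads come from `Ctx.TTI.getScalarizationOverhead` and `Ctx.TTI.getOperandsScalarizationOverhead`, whose results are target-specific.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a VPValue operand; not an LLVM type.
struct ToyOperand {
  std::string Name;  // identity used to de-duplicate repeated operands
  bool IsLiveIn;     // defined outside the loop, so already scalar
  bool IsReplicated; // produced by another replicated (scalarized) recipe
};

// Assumed per-lane costs; a real target reports these through TTI.
constexpr std::uint64_t InsertCostPerLane = 1;
constexpr std::uint64_t ExtractCostPerLane = 1;

// Toy cost of a call that is replicated once per lane of a fixed-width VF.
std::uint64_t toyReplicatedCallCost(std::uint64_t ScalarCallCost, unsigned VF,
                                    bool ReturnsVoid,
                                    const std::vector<ToyOperand> &Args) {
  std::uint64_t ScalarizationCost = 0;
  if (VF > 1) {
    // Packing the VF scalar results back into a vector costs one insert
    // per lane, unless the call returns void.
    if (!ReturnsVoid)
      ScalarizationCost += InsertCostPerLane * VF;

    // Each distinct vector operand needs one extract per lane. Live-ins and
    // values that are already scalarized are skipped, as are repeats.
    std::set<std::string> Seen;
    for (const ToyOperand &Op : Args) {
      if (Op.IsLiveIn || Op.IsReplicated || !Seen.insert(Op.Name).second)
        continue;
      ScalarizationCost += ExtractCostPerLane * VF;
    }
  }
  return ScalarCallCost * VF + ScalarizationCost;
}
```

Under these assumed unit costs, a call with scalar cost 10 at VF = 4, a non-void result, and the arguments (a, a, n) where n is a live-in comes to 10 * 4 + 4 + 4 = 48: the repeated operand a is charged once and n is not charged at all.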