[VPlan] Materialize vector trip count using VPInstructions. #151925
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -3153,7 +3153,7 @@ void VPlanTransforms::materializeBroadcasts(VPlan &Plan) { | |||||
} | ||||||
} | ||||||
|
||||||
void VPlanTransforms::materializeVectorTripCount( | ||||||
void VPlanTransforms::materializeConstantVectorTripCount( | ||||||
VPlan &Plan, ElementCount BestVF, unsigned BestUF, | ||||||
PredicatedScalarEvolution &PSE) { | ||||||
assert(Plan.hasVF(BestVF) && "BestVF is not available in Plan"); | ||||||
|
@@ -3191,6 +3191,62 @@ void VPlanTransforms::materializeBackedgeTakenCount(VPlan &Plan, | |||||
BTC->replaceAllUsesWith(TCMO); | ||||||
} | ||||||
|
||||||
void VPlanTransforms::materializeVectorTripCount(VPlan &Plan, | ||||||
VPBasicBlock *VectorPHVPBB, | ||||||
bool TailByMasking, | ||||||
bool RequiresScalarEpilogue) { | ||||||
VPValue &VectorTC = Plan.getVectorTripCount(); | ||||||
if (VectorTC.getNumUsers() == 0 || | ||||||
(VectorTC.isLiveIn() && VectorTC.getLiveInIRValue())) | ||||||
Can the VectorTC ever not be a live in at this point? Apart from |
||||||
return; | ||||||
VPValue *TC = Plan.getTripCount(); | ||||||
Type *TCTy = VPTypeAnalysis(Plan).inferScalarType(TC); | ||||||
VPBuilder Builder(VectorPHVPBB, VectorPHVPBB->begin()); | ||||||
|
||||||
VPValue *Step = &Plan.getVFxUF(); | ||||||
|
||||||
// If the tail is to be folded by masking, round the number of iterations N | ||||||
// up to a multiple of Step instead of rounding down. This is done by first | ||||||
// adding Step-1 and then rounding down. Note that it's ok if this addition | ||||||
// overflows: the vector induction variable will eventually wrap to zero given | ||||||
// that it starts at zero and its Step is a power of two; the loop will then | ||||||
// exit, with the last early-exit vector comparison also producing all-true. | ||||||
// For scalable vectors the VF is not guaranteed to be a power of 2, but this | ||||||
// is accounted for in emitIterationCountCheck that adds an overflow check. | ||||||
if (TailByMasking) { | ||||||
TC = Builder.createNaryOp( | ||||||
Instruction::Add, | ||||||
{TC, Builder.createNaryOp( | ||||||
Instruction::Sub, | ||||||
{Step, Plan.getOrAddLiveIn(ConstantInt::get(TCTy, 1))})}, | ||||||
DebugLoc::getUnknown(), "n.rnd.up"); | ||||||
} | ||||||
|
||||||
// Now we need to generate the expression for the part of the loop that the | ||||||
// vectorized body will execute. This is equal to N - (N % Step) if scalar | ||||||
// iterations are not required for correctness, or N - Step, otherwise. Step | ||||||
// is equal to the vectorization factor (number of SIMD elements) times the | ||||||
// unroll factor (number of SIMD instructions). | ||||||
VPValue *R = Builder.createNaryOp(Instruction::URem, {TC, Step}, | ||||||
DebugLoc::getUnknown(), "n.mod.vf"); | ||||||
Should we start annotating these VPlan generated values as compiler generated?
Suggested change
|
||||||
|
||||||
// There are cases where we *must* run at least one iteration in the remainder | ||||||
// loop. See the cost model for when this can happen. If the step evenly | ||||||
// divides the trip count, we set the remainder to be equal to the step. If | ||||||
// the step does not evenly divide the trip count, no adjustment is necessary | ||||||
// since there will already be scalar iterations. Note that the minimum | ||||||
// iterations check ensures that N >= Step. | ||||||
if (RequiresScalarEpilogue) { | ||||||
auto *IsZero = Builder.createICmp( | ||||||
CmpInst::ICMP_EQ, R, Plan.getOrAddLiveIn(ConstantInt::get(TCTy, 0))); | ||||||
R = Builder.createSelect(IsZero, Step, R); | ||||||
} | ||||||
|
||||||
auto Res = Builder.createNaryOp(Instruction::Sub, {TC, R}, | ||||||
DebugLoc::getUnknown(), "n.vec"); | ||||||
Plan.getVectorTripCount().replaceAllUsesWith(Res); | ||||||
} | ||||||
|
||||||
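For readers following the arithmetic in the comments above, here is a minimal scalar sketch of what the emitted `n.rnd.up`, `n.mod.vf`, and `n.vec` values compute. It is not LLVM code: `vectorTripCount` is a hypothetical helper, `uint64_t` stands in for the trip count type, and `Step` stands in for VF x UF and is assumed to be non-zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of the values built in materializeVectorTripCount.
// Not part of LLVM; names and types are illustrative assumptions.
uint64_t vectorTripCount(uint64_t TC, uint64_t Step, bool TailByMasking,
                         bool RequiresScalarEpilogue) {
  assert(Step != 0 && "Step (VF * UF) is assumed to be non-zero");

  // n.rnd.up: when folding the tail by masking, round up to a multiple of
  // Step by adding Step - 1 before rounding down. Unsigned arithmetic wraps
  // on overflow, as the IR add does.
  if (TailByMasking)
    TC += Step - 1;

  // n.mod.vf: remainder of the (possibly rounded-up) trip count.
  uint64_t R = TC % Step;

  // If a scalar epilogue is required and Step divides TC evenly, set the
  // remainder to Step so the scalar loop still runs.
  if (RequiresScalarEpilogue && R == 0)
    R = Step;

  // n.vec: number of iterations covered by the vector loop.
  return TC - R;
}
```

With a trip count of 10 and a Step of 4, the plain case yields 8 vector iterations; with tail folding it yields 12. With a trip count of 8, a Step of 4, and a required scalar epilogue, the result drops from 8 to 4 so that the last 4 iterations run in the scalar loop.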
/// Returns true if \p V is VPWidenLoadRecipe or VPInterleaveRecipe that can be | ||||||
/// converted to a narrower recipe. \p V is used by a wide recipe that feeds a | ||||||
/// store interleave group at index \p Idx, \p WideMember0 is the recipe feeding | ||||||
|
How come we're marking these as public?