Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 3 additions & 1 deletion llvm/lib/Transforms/Vectorize/VPlan.h
Original file line number Diff line number Diff line change
Expand Up @@ -2162,7 +2162,9 @@ class VPWidenInductionRecipe : public VPHeaderPHIRecipe {
/// incoming value, its start value.
unsigned getNumIncoming() const override { return 1; }

PHINode *getPHINode() const { return cast<PHINode>(getUnderlyingValue()); }
PHINode *getPHINode() const {
return cast_if_present<PHINode>(getUnderlyingValue());
}

/// Returns the induction descriptor for the recipe.
const InductionDescriptor &getInductionDescriptor() const { return IndDesc; }
Expand Down
156 changes: 145 additions & 11 deletions llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -13,10 +13,12 @@

#include "LoopVectorizationPlanner.h"
#include "VPlan.h"
#include "VPlanAnalysis.h"
#include "VPlanCFG.h"
#include "VPlanDominatorTree.h"
#include "VPlanPatternMatch.h"
#include "VPlanTransforms.h"
#include "VPlanUtils.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/LoopIterator.h"
#include "llvm/Analysis/ScalarEvolution.h"
Expand Down Expand Up @@ -1120,6 +1122,129 @@ bool VPlanTransforms::handleMaxMinNumReductions(VPlan &Plan) {
return true;
}

/// For argmin/argmax reductions with strict predicates, convert the existing
/// FindLastIV reduction to a new UMin reduction of a wide canonical IV. If the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// FindLastIV reduction to a new UMin reduction of a wide canonical IV. If the
/// FindFirstIV reduction to a new UMin reduction of a wide canonical IV. If the

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the input, it will be a FindLastIV reduction, which we need to convert here

/// original IV was not canonical, a new canonical wide IV is added, and the
/// final result is scaled back to the original IV.
static bool handleFirstArgMinArgMax(VPlan &Plan,
VPReductionPHIRecipe *MinMaxPhiR,
VPReductionPHIRecipe *FindIVPhiR,
VPWidenIntOrFpInductionRecipe *WideIV,
VPInstruction *MinMaxResult) {
Type *Ty = Plan.getVectorLoopRegion()->getCanonicalIVType();
// TODO: Support different IV types.
if (Ty != VPTypeAnalysis(Plan).inferScalarType(FindIVPhiR))
return false;

// If the original wide IV is not canonical, create a new one. The wide IV is
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why create a new one instead of using the loop's CanonicalIV?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The canonical IV will be scalar, we could create a vector value based on it, but in other places we also start with wide IVs, and scalarize them if desirable.

// guaranteed to not wrap for all lanes that are active in the vector loop.
if (!WideIV->isCanonical()) {
VPValue *Zero = Plan.getConstantInt(Ty, 0);
VPValue *One = Plan.getConstantInt(Ty, 1);
auto *WidenCanIV = new VPWidenIntOrFpInductionRecipe(
nullptr, Zero, One, WideIV->getVFValue(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the recipe works with null as its underlying value, perhaps all instances should consistently be constructed w/o an underlying value, and the recipe be slightly renamed VPWideIntOrFpInductionRecipe ("Wide" instead of "Widen").

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I am looking into removing the reliance on the underlying values separately.

WideIV->getInductionDescriptor(),
VPIRFlags::WrapFlagsTy(/*HasNUW=*/true, /*HasNSW=*/false),
WideIV->getDebugLoc());
WidenCanIV->insertBefore(WideIV);

// Update the select to use the wide canonical IV.
auto *SelectR = cast<VPSingleDefRecipe>(
FindIVPhiR->getBackedgeValue()->getDefiningRecipe());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth matching SelectRecipe to make sure it means what we think it means, and assert or return false otherwise?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added assert, thanks

assert(match(SelectR, m_Select(m_VPValue(), m_VPValue(), m_VPValue())) &&
"backedge value must be a select");
WideIV->replaceUsesWithIf(WidenCanIV, [SelectR](const VPUser &U, unsigned) {
return SelectR == &U;
});
}

// Create the new UMin reduction recipe to track the minimum index.
assert(!FindIVPhiR->isInLoop() && !FindIVPhiR->isOrdered() &&
"inloop and ordered reductions not supported");
VPValue *MaxInt =
Plan.getConstantInt(APInt::getMaxValue(Ty->getIntegerBitWidth()));
assert(FindIVPhiR->getVFScaleFactor() == 1 &&
"FindIV reduction must not be scaled");
ReductionStyle Style = RdxUnordered{1};
auto *FirstIdxPhiR = new VPReductionPHIRecipe(
dyn_cast_or_null<PHINode>(FindIVPhiR->getUnderlyingValue()),
RecurKind::UMin, *MaxInt, *FindIVPhiR->getBackedgeValue(), Style,
FindIVPhiR->hasUsesOutsideReductionChain());
FirstIdxPhiR->insertBefore(FindIVPhiR);

VPInstruction *FindLastIVResult =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
VPInstruction *FindLastIVResult =
VPInstruction *FindFirstIVResult =

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the old FindLastrIVResult from the initial FindLastIV reduction

findUserOf<VPInstruction::ComputeFindIVResult>(FindIVPhiR);
MinMaxResult->moveBefore(*FindLastIVResult->getParent(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent: good to have a moveBefore(VPRecipeBase *InsertPos).

FindLastIVResult->getIterator());

// The reduction using MinMaxPhiR needs adjusting to compute the correct
// result:
// 1. Find the first canonical indices corresponding to partial min/max
// values, using loop reductions.
// 2. Find which of the partial min/max values are equal to the overall
// min/max value.
// 3. Select among the canonical indices those corresponding to the overall
// min/max value.
// 4. Find the first canonical index of overall min/max and scale it back to
// the original IV using VPDerivedIVRecipe.
// 5. If the overall min/max is equal to the start value, the condition in
// the
// loop was always false, due to being strict; return the start value in
// that case.
//
// The original reductions need adjusting:
// For example, this transforms
// vp<%min.result> = compute-reduction-result ir<%min.val>, ir<%min.val.next>
// vp<%find.iv.result> = compute-find-iv-result ir<%min.idx>, ir<0>,
// ir<Sentinel>,
// vp<%min.idx.next>
//
// into:
// vp<%min.result> = compute-reduction-result ir<%min.val>, ir<%min.val.next>
// vp<%final.min.cmp> = icmp eq ir<%min.val.next>, vp<%min.result>
// vp<%final.min.idx> = select vp<%final.min.cmp>, ir<%min.idx.next>,
// ir<MaxUInt> vp<%13> = compute-reduction-result ir<%min.idx>,
// vp<%final.min.idx> vp<%scaled.result.iv> = DERIVED-IV ir<20> + vp<%13> *
// ir<1>
// vp<%always.false> = icmp eq vp<%min.result>, ir<%original.min.start>
// vp<%final.result> = select vp<%always.false>, vp<%scaled.result.iv>,
// ir<%original.start>

VPBuilder Builder(FindLastIVResult);
VPValue *MinMaxExiting = MinMaxResult->getOperand(1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: instead of using "MinMax" to mean either one, pick one and explain that everything holds also for the other. As in the example above. Can also use "Extremum", but that's even longer.

auto *FinalMinMaxCmp =
Builder.createICmp(CmpInst::ICMP_EQ, MinMaxExiting, MinMaxResult);
VPValue *LastIVExiting = FindLastIVResult->getOperand(3);
auto *FinalIVSelect =
Builder.createSelect(FinalMinMaxCmp, LastIVExiting, MaxInt);
VPSingleDefRecipe *FinalResult = Builder.createNaryOp(
VPInstruction::ComputeReductionResult, {FirstIdxPhiR, FinalIVSelect}, {},
FindLastIVResult->getDebugLoc());

// If we used a new wide canonical IV convert the reduction result back to the
// original IV scale before the final select.
if (!WideIV->isCanonical()) {
auto *DerivedIVRecipe =
new VPDerivedIVRecipe(InductionDescriptor::IK_IntInduction,
nullptr, // No FPBinOp for integer induction
WideIV->getStartValue(), FinalResult,
WideIV->getStepValue(), "derived.iv.result");
DerivedIVRecipe->insertBefore(&*Builder.getInsertPoint());
FinalResult = DerivedIVRecipe;
}

// If the final min/max value matches the start value, the condition in the
// loop was always false, i.e. no induction value has been selected. If that's
// the case, use the original start value.
VPValue *AlwaysFalse = Builder.createICmp(
CmpInst::ICMP_EQ, MinMaxPhiR->getStartValue(), MinMaxResult);
VPValue *Res = Builder.createSelect(AlwaysFalse, FinalResult,
FindLastIVResult->getOperand(1));
FindIVPhiR->replaceAllUsesWith(FirstIdxPhiR);
FindLastIVResult->replaceAllUsesWith(Res);
return true;
}

bool VPlanTransforms::handleMultiUseReductions(VPlan &Plan) {
for (auto &PhiR : make_early_inc_range(
Plan.getVectorLoopRegion()->getEntryBasicBlock()->phis())) {
Expand All @@ -1131,7 +1256,7 @@ bool VPlanTransforms::handleMultiUseReductions(VPlan &Plan) {
// MinMaxPhiR has users outside the reduction cycle in the loop. Check if
// the only other user is a FindLastIV reduction. MinMaxPhiR must have
// exactly 3 users: 1) the min/max operation, the compare of a FindLastIV
// reduction and ComputeReductionResult. The comparisom must compare
// reduction and ComputeReductionResult. The comparison must compare
// MinMaxPhiR against the min/max operand used for the min/max reduction
// and only be used by the select of the FindLastIV reduction.
RecurKind RdxKind = MinMaxPhiR->getRecurrenceKind();
Expand Down Expand Up @@ -1203,33 +1328,42 @@ bool VPlanTransforms::handleMultiUseReductions(VPlan &Plan) {
FindIVPhiR->getRecurrenceKind()))
return false;

assert(!FindIVPhiR->isInLoop() && !FindIVPhiR->isOrdered() &&
"cannot handle inloop/ordered reductions yet");

// TODO: Support cases where IVOp is the IV increment.
if (!match(IVOp, m_TruncOrSelf(m_VPValue(IVOp))) ||
!isa<VPWidenIntOrFpInductionRecipe>(IVOp))
return false;

CmpInst::Predicate RdxPredicate = [RdxKind]() {
// Check if the predicate is compatible with the reduction kind.
bool IsValidPredicate = [RdxKind, Pred]() {
switch (RdxKind) {
case RecurKind::UMin:
return CmpInst::ICMP_UGE;
return Pred == CmpInst::ICMP_UGE || Pred == CmpInst::ICMP_UGT;
case RecurKind::UMax:
return CmpInst::ICMP_ULE;
return Pred == CmpInst::ICMP_ULE || Pred == CmpInst::ICMP_ULT;
case RecurKind::SMax:
return CmpInst::ICMP_SLE;
return Pred == CmpInst::ICMP_SLE || Pred == CmpInst::ICMP_SLT;
case RecurKind::SMin:
return CmpInst::ICMP_SGE;
return Pred == CmpInst::ICMP_SGE || Pred == CmpInst::ICMP_SGT;
default:
llvm_unreachable("unhandled recurrence kind");
}
}();

// TODO: Strict predicates need to find the first IV value for which the
// predicate holds, not the last.
if (Pred != RdxPredicate)
if (!IsValidPredicate)
return false;

assert(!FindIVPhiR->isInLoop() && !FindIVPhiR->isOrdered() &&
"cannot handle inloop/ordered reductions yet");
// For strict predicates, use a UMin reduction to find the minimum index.
// Canonical IVs (0, 1, 2, ...) are guaranteed not to wrap in the vector
// loop, so UMin can always be used.
bool IsStrictPredicate = ICmpInst::isLT(Pred) || ICmpInst::isGT(Pred);
if (IsStrictPredicate) {
return handleFirstArgMinArgMax(Plan, MinMaxPhiR, FindIVPhiR,
cast<VPWidenIntOrFpInductionRecipe>(IVOp),
MinMaxResult);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be clearer to outline the rest into (else) return handleLastArgMinArgMax()?


// The reduction using MinMaxPhiR needs adjusting to compute the correct
// result:
Expand Down
2 changes: 2 additions & 0 deletions llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -164,6 +164,7 @@ bool VPRecipeBase::mayHaveSideEffects() const {
return cast<VPExpressionRecipe>(this)->mayHaveSideEffects();
case VPDerivedIVSC:
case VPFirstOrderRecurrencePHISC:
case VPReductionPHISC:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be applied independently but tested/matters only now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

case VPPredInstPHISC:
case VPVectorEndPointerSC:
return false;
Expand Down Expand Up @@ -1185,6 +1186,7 @@ bool VPInstruction::opcodeMayReadOrWriteFromMemory() const {
case VPInstruction::BuildVector:
case VPInstruction::CalculateTripCountMinusVF:
case VPInstruction::CanonicalIVIncrementForPart:
case VPInstruction::ComputeFindIVResult:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be applied independently but tested/matters only now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep

case VPInstruction::ExtractLane:
case VPInstruction::ExtractLastLane:
case VPInstruction::ExtractLastPart:
Expand Down
4 changes: 2 additions & 2 deletions llvm/lib/Transforms/Vectorize/VPlanTransforms.h
Original file line number Diff line number Diff line change
Expand Up @@ -157,8 +157,8 @@ struct VPlanTransforms {
const TargetLibraryInfo &TLI);

/// Try to legalize reductions with multiple in-loop uses. Currently only
/// min/max reductions used by FindLastIV reductions are supported. Otherwise
/// return false.
/// min/max reductions used by FindLastIV and FindFirstIV reductions are
/// supported. Otherwise return false.
static bool handleMultiUseReductions(VPlan &Plan);

/// Try to have all users of fixed-order recurrences appear after the recipe
Expand Down
Loading