Skip to content

[RISCV] Simplify one of the RV32 PACK isel patterns. #152045

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

topperc
Copy link
Collaborator

@topperc topperc commented Aug 4, 2025

This pattern previously checked a specific variant of 4 bytes being packed that is generated by unaligned load expansion.

Our individual PACK patterns don't handle that particular case because a DAG combine turns (or (or A, (shl B, 8)), (shl (or C, (shl D, 8)), 16)) into (or (or A, (shl B, 8)), (or (shl C, 16), (shl D, 24)). After this, the outer OR doesn't have a shl operand so we needed a pattern that looks through 2 layers of OR.

To match this pattern we don't need to look at the (or A, (shl B, 8)) part since that part wasn't affected by the DAG combine and can be matched to PACKH by itself. It's enough to make sure that part of the pattern has zeros in the upper 16 bits.

I believe this allows tablegen to automatically generate more permutations of this pattern. The associative and commutative variant expansion is limited to 3 children.

This pattern previously checked a specific variant of 4 bytes
being packed that is generated by unaligned load expansion.

Our simplest PACK patterns misses this case because we don't have
a single shift left by 16. We have two shift lefts hidden behind
another OR.

We only need the pattern to find the 2 shifts in the upper part,
for the lower part we only care that the upper 16 bits are zero.
If the lower bits can also be a PACKH that can be selected separately
after.

I believe this allows tablegen to create more patterns for permutations
of this pattern. The associative and commutative variant expansion
is limited to 3 children.
@llvmbot
Copy link
Member

llvmbot commented Aug 4, 2025

@llvm/pr-subscribers-backend-risc-v

Author: Craig Topper (topperc)

Changes

This pattern previously checked a specific variant of 4 bytes being packed that is generated by unaligned load expansion.

Our simplest PACK patterns misses this case because we don't have a single shift left by 16. We have two shift lefts hidden behind another OR.

We only need the pattern to find the 2 shifts in the upper part, for the lower part we only care that the upper 16 bits are zero. If the lower bits can also be a PACKH that can be selected separately after.

I believe this allows tablegen to create more patterns for permutations of this pattern. The associative and commutative variant expansion is limited to 3 children.


Full diff: https://github.com/llvm/llvm-project/pull/152045.diff

1 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVInstrInfoZb.td (+4-4)
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td b/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
index d2a651444169c..94fb23d554f1a 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoZb.td
@@ -641,13 +641,13 @@ def : Pat<(binop_allhusers<or> (shl GPR:$rs2, (XLenVT 8)),
 let Predicates = [HasStdExtZbkb, IsRV32] in {
 def : Pat<(i32 (or (zexti16 (i32 GPR:$rs1)), (shl GPR:$rs2, (i32 16)))),
           (PACK GPR:$rs1, GPR:$rs2)>;
+// Match a packh for the high half with a zero extended value in the low half.
+// If the low half also happens to be a packh, it can be matched separately.
 def : Pat<(or (or
                   (shl (zexti8 (XLenVT GPR:$op1rs2)), (XLenVT 24)),
                   (shl (zexti8 (XLenVT GPR:$op1rs1)), (XLenVT 16))),
-              (or
-                  (shl (zexti8 (XLenVT GPR:$op0rs2)), (XLenVT 8)),
-                  (zexti8 (XLenVT GPR:$op0rs1)))),
-          (PACK (XLenVT (PACKH GPR:$op0rs1, GPR:$op0rs2)),
+              (zexti16 (XLenVT GPR:$op0))),
+          (PACK (XLenVT GPR:$op0),
                 (XLenVT (PACKH GPR:$op1rs1, GPR:$op1rs2)))>;
 }
 

@lenary
Copy link
Member

lenary commented Aug 5, 2025

You reference a missed case - can it be turned into a test? I got the impression it could be, but maybe that's wrong?

@topperc
Copy link
Collaborator Author

topperc commented Aug 5, 2025

You reference a missed case - can it be turned into a test? I got the impression it could be, but maybe that's wrong?

The place where I explicitly said "missed case" was trying to explain why Alex needed to add this pattern originally. But it should be possible to test the new commuted+associative permutations from tablegen or just not having a (or A, (shl B, 8)) in the lower 16 bits. I'll try.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants