riscv64: Add extractlane and splat instructions#6397
Conversation
These instructions move values from vectors into other register types and vice-versa.
Subscribe to Label ActionDetailsThis issue or pull request has been labeled: "cranelift", "isle"Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
| ;;;; Multi-Instruction Helpers ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; | ||
|
|
||
| (decl gen_extractlane (Type Reg u8) Reg) | ||
|
|
||
| ;; When extracting lane 0 for floats, we can use `vfmv.f.s` directly. | ||
| (rule 3 (gen_extractlane (ty_vec_fits_in_register ty) src 0) | ||
| (if (ty_vector_float ty)) | ||
| (rv_vfmv_fs src ty)) | ||
|
|
||
| ;; When extracting lane 0 for integers, we can use `vmv.x.s` directly. | ||
| (rule 2 (gen_extractlane (ty_vec_fits_in_register ty) src 0) | ||
| (if (ty_vector_not_float ty)) | ||
| (rv_vmv_xs src ty)) | ||
|
|
||
| ;; In the general case, we must first use a `vslidedown` to place the correct lane | ||
| ;; in index 0, and then use the appropriate `vmv` instruction. | ||
| ;; If the index fits into a 5-bit immediate, we can emit a `vslidedown.vi`. | ||
| (rule 1 (gen_extractlane (ty_vec_fits_in_register ty) src (uimm5_from_u8 idx)) | ||
| (gen_extractlane ty (rv_vslidedown_vi src idx ty) 0)) | ||
|
|
||
| ;; Otherwise lower it into an X register. | ||
| (rule 0 (gen_extractlane (ty_vec_fits_in_register ty) src idx) | ||
| (gen_extractlane ty (rv_vslidedown_vx src (imm $I64 idx) ty) 0)) |
There was a problem hiding this comment.
Out of curiosity, is there a particular motivation for having this helper here vs inlining it into the lowering of extractlane?
There was a problem hiding this comment.
Mostly to avoid having two rules for float and integer in the slide down cases. vslidedown is generic across integer and float elements, but vmv is not, so we recursively call this rule to decide the correct vmv instruction to use.
That being said, we can probably inline the vslidedown rules and have just a generic vmv that decides the correct instruction based on the type.
There was a problem hiding this comment.
Oh sorry to clarify I mean that you've got 4 cases of gen_extractlane here, but why not have those 4 cases be cases on lower (extractlane ..)?
There was a problem hiding this comment.
If I inline it directly into (lower (extractlane ..)) then it would become, 6 rules, right? Since I would have to duplicate vslidedown.vi+vmv, vslidedown.vi+vfm, vslidedown.vx+vmv and vslidedown.vx+vfm, which are currently just 2 rules.
Or can I then recursively call the lowering rules for extractlane directly?
But that was basically the reasoning, I could avoid 2 rules, this way.
There was a problem hiding this comment.
Ah I apologize I missed that crucial bit of this being a recursive rule! In that case yeah definitely makes sense as a standalone decl.
| ;; TODO: We can splat out more patterns by using for example a vmv.v.i i8x16 for | ||
| ;; a i64x2 const with a compatible bit pattern. |
There was a problem hiding this comment.
IIRC the aarch64 backend does some trickery along these lines where it iteratively halves the size of a constant if it's splatted, which may serve as good inspiration for supporting this.
…#6397) * riscv64: Add `vslidedown.v{x,i}` instructions * riscv64: Add `v{f,}mv` instructions These instructions move values from vectors into other register types and vice-versa. * riscv64: Add `extractlane` lowerings * riscv64: Add `vmv.v.*` instructions * riscv64: Implement `splat` * riscv64: Add `vmv.v.i` instruction * riscv64: Remove unused `imm5_zero` * wasmtime: Enable more RISC-V SIMD tests * cranelift: Enable ssse3 tests for `fadd-splat` testsuite * riscv64: Update splat TODO comment
👋 Hey,
This PR implements both
extractlaneandsplat.splatis fairly simple in that we have a bunch of move instructions that by default splat the source X or F register into the vector register. These arevmv.v.x,vfmv.v.fandvmv.v.i, for X, F and immediate sources. The only noteworthy thing about these instructions is that they have weird encodings, and I've added two new instruction formats to deal with this.vmv.v.ihas no source operands, only a destination register. And the other ones could maybe be encoded using the existing Imm5 instruction format it was becoming a bit weird keeping all of the variations together.For
extractlanewe have two additional move instructionsvfmv.f.sandvmv.x.sthese move element 0 of the source vector into the appropriate X or F register. Additionally for extracting other elements we usevslidedownthat moves all elements of a vector down by n positions and then emit the appropriate move into the destination register.