riscv64: Add `extractlane` and `splat` instructions by afonso360 · Pull Request #6397 · bytecodealliance/wasmtime

afonso360 · 2023-05-17T12:37:18Z

👋 Hey,

This PR implements both extractlane and splat.

splat is fairly simple: we have a set of move instructions that by default splat the source X or F register into the vector register. These are vmv.v.x, vfmv.v.f and vmv.v.i, for X, F and immediate sources respectively. The only noteworthy thing about these instructions is that they have unusual encodings, and I've added two new instruction formats to deal with this. vmv.v.i has no source operands, only a destination register. The other ones could maybe be encoded using the existing Imm5 instruction format, but it was becoming a bit weird keeping all of the variations together.
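
As a rough illustration of the choice between the three splat moves, here is a minimal Rust sketch. The `SplatSource` and `SplatInst` types and the `select_splat` function are hypothetical names made up for this example and are not part of Cranelift. It assumes that `vmv.v.i` takes a 5-bit sign-extended immediate (so values from -16 to 15), and that any other integer constant is first materialized in an X register.

```rust
/// Hypothetical description of where the splatted value comes from.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SplatSource {
    /// A compile-time integer constant.
    IntConst(i64),
    /// A value that already lives in an X (integer) register.
    IntReg,
    /// A value that already lives in an F (float) register.
    FloatReg,
}

/// Hypothetical result: which vector move would be used.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SplatInst {
    /// `vmv.v.i vd, imm` with a 5-bit signed immediate.
    VmvVI(i8),
    /// `vmv.v.x vd, rs1`.
    VmvVX,
    /// `vfmv.v.f vd, rs1`.
    VfmvVF,
}

/// Pick a splat instruction for the given source (illustrative only).
fn select_splat(src: SplatSource) -> SplatInst {
    match src {
        // Small constants fit the immediate form directly.
        SplatSource::IntConst(c) if (-16..=15).contains(&c) => SplatInst::VmvVI(c as i8),
        // Larger constants are assumed to be loaded into an X register first.
        SplatSource::IntConst(_) | SplatSource::IntReg => SplatInst::VmvVX,
        SplatSource::FloatReg => SplatInst::VfmvVF,
    }
}
```

For example, `select_splat(SplatSource::IntConst(7))` picks the immediate form, while `select_splat(SplatSource::IntConst(100))` falls back to `vmv.v.x`.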

For extractlane we have two additional move instructions, vfmv.f.s and vmv.x.s, which move element 0 of the source vector into the appropriate X or F register. To extract other elements, we first use vslidedown, which moves all elements of a vector down by n positions, and then emit the appropriate move into the destination register.
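
The slide-then-move idea can be modeled on plain slices. The following is a minimal sketch, not the backend's code: `vslidedown`, `vmv_x_s` and `extract_lane` are hypothetical helper names, the slice length stands in for the vector length, and elements slid in from past the end are assumed to read as zero.

```rust
/// Model of `vslidedown`: element `i` of the result is element `i + offset`
/// of the source. Positions that would read past the end become zero
/// (an assumption of this model, with the slice length as the vector length).
fn vslidedown(src: &[u64], offset: usize) -> Vec<u64> {
    (0..src.len())
        .map(|i| {
            i.checked_add(offset)
                .and_then(|j| src.get(j))
                .copied()
                .unwrap_or(0)
        })
        .collect()
}

/// Model of `vmv.x.s`: move element 0 of the vector into a scalar.
/// Panics on an empty slice, since there is no element 0 to move.
fn vmv_x_s(src: &[u64]) -> u64 {
    src[0]
}

/// Extract an arbitrary lane by sliding it down to index 0 first.
fn extract_lane(src: &[u64], lane: usize) -> u64 {
    if lane == 0 {
        // Lane 0 needs no slide at all.
        vmv_x_s(src)
    } else {
        vmv_x_s(&vslidedown(src, lane))
    }
}
```

With the input `[10, 20, 30, 40]`, extracting lane 2 slides the vector down to `[30, 40, 0, 0]` and then reads element 0.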

These instructions move values from vectors into other register types and vice-versa.

github-actions · 2023-05-17T12:50:07Z

Subscribe to Label Action

cc @cfallin, @fitzgen


This issue or pull request has been labeled: "cranelift", "isle"

Thus the following users have been cc'd because of the following labels:

cfallin: isle
fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.


alexcrichton

Nice!

alexcrichton · 2023-05-17T14:25:50Z

+;;;; Multi-Instruction Helpers ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+(decl gen_extractlane (Type Reg u8) Reg)
+
+;; When extracting lane 0 for floats, we can use `vfmv.f.s` directly.
+(rule 3 (gen_extractlane (ty_vec_fits_in_register ty) src 0)
+  (if (ty_vector_float ty))
+  (rv_vfmv_fs src ty))
+
+;; When extracting lane 0 for integers, we can use `vmv.x.s` directly.
+(rule 2 (gen_extractlane (ty_vec_fits_in_register ty) src 0)
+  (if (ty_vector_not_float ty))
+  (rv_vmv_xs src ty))
+
+;; In the general case, we must first use a `vslidedown` to place the correct lane
+;; in index 0, and then use the appropriate `vmv` instruction.
+;; If the index fits into a 5-bit immediate, we can emit a `vslidedown.vi`.
+(rule 1 (gen_extractlane (ty_vec_fits_in_register ty) src (uimm5_from_u8 idx))
+  (gen_extractlane ty (rv_vslidedown_vi src idx ty) 0))
+
+;; Otherwise lower it into an X register.
+(rule 0 (gen_extractlane (ty_vec_fits_in_register ty) src idx)
+  (gen_extractlane ty (rv_vslidedown_vx src (imm $I64 idx) ty) 0))


Out of curiosity, is there a particular motivation for having this helper here vs inlining it into the lowering of extractlane?

Mostly to avoid having two rules for float and integer in the slide down cases. vslidedown is generic across integer and float elements, but vmv is not, so we recursively call this rule to decide the correct vmv instruction to use.

That being said, we can probably inline the vslidedown rules and have just a generic vmv that decides the correct instruction based on the type.

Oh sorry, to clarify: I mean that you've got 4 cases of gen_extractlane here, but why not have those 4 cases be cases on lower (extractlane ..)?

If I inline it directly into (lower (extractlane ..)) then it would become 6 rules, right? Since I would have to duplicate vslidedown.vi+vmv, vslidedown.vi+vfmv, vslidedown.vx+vmv and vslidedown.vx+vfmv, which are currently just 2 rules.

Or can I then recursively call the lowering rules for extractlane directly?

But that was basically the reasoning: I could avoid 2 rules this way.

Ah, I apologize, I missed that crucial bit of this being a recursive rule! In that case yeah, it definitely makes sense as a standalone decl.
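
To make the rule-count argument concrete, here is a hedged Rust sketch that mirrors the shape of the four `gen_extractlane` rules as a recursive function. `lower_extractlane` is a hypothetical name, the output is just a list of mnemonics rather than real machine instructions, and the `lane < 32` check stands in for the `uimm5_from_u8` extractor on the assumption that it accepts 5-bit unsigned values.

```rust
/// Return the mnemonics that would be emitted for an extractlane of `lane`
/// (illustrative only; register operands and types are omitted).
fn lower_extractlane(is_float: bool, lane: u8) -> Vec<&'static str> {
    match lane {
        // Lane 0: the move depends on the element type.
        0 if is_float => vec!["vfmv.f.s"],
        0 => vec!["vmv.x.s"],
        // Any other lane: slide it down to index 0, then recurse with lane 0
        // so the float/integer decision is written only once.
        _ => {
            let slide = if lane < 32 {
                // Index fits in a 5-bit unsigned immediate.
                "vslidedown.vi"
            } else {
                // Index has to be materialized in an X register first
                // (the instruction that loads it is not modeled here).
                "vslidedown.vx"
            };
            let mut insts = vec![slide];
            insts.extend(lower_extractlane(is_float, 0));
            insts
        }
    }
}
```

The two slide arms are shared by both element types, which is how four cases can cover what would otherwise be six distinct instruction sequences.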

alexcrichton · 2023-05-17T14:26:47Z

+;; TODO: We can splat out more patterns by using for example a vmv.v.i i8x16 for
+;; a i64x2 const with a compatible bit pattern.


IIRC the aarch64 backend does some trickery along these lines where it iteratively halves the size of a constant if it's splatted, which may serve as good inspiration for supporting this.
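
One way to picture the halving idea is the sketch below. It is not taken from the aarch64 backend or from this PR. `narrow_splat` and `fits_simm5` are hypothetical helpers, lane widths are assumed to be 8, 16, 32 or 64 bits, and the 5-bit signed range check reflects the assumption that `vmv.v.i` sign-extends its immediate.

```rust
/// Repeatedly halve the lane width of a splatted constant while both halves
/// of the lane hold the same bit pattern. Returns (lane_bits, lane_value).
/// Assumes `bits` is 8, 16, 32 or 64 and `value` fits in `bits` bits.
fn narrow_splat(mut value: u64, mut bits: u32) -> (u32, u64) {
    while bits > 8 {
        let half = bits / 2;
        let mask = (1u64 << half) - 1;
        let lo = value & mask;
        let hi = (value >> half) & mask;
        if lo != hi {
            // The halves differ, so a narrower splat would change the value.
            break;
        }
        value = lo;
        bits = half;
    }
    (bits, value)
}

/// Check whether a lane value, read as a signed `bits`-wide integer,
/// fits in a 5-bit signed immediate (-16..=15).
fn fits_simm5(value: u64, bits: u32) -> bool {
    let shift = 64 - bits;
    // Sign-extend from `bits` to 64 bits using an arithmetic shift.
    let signed = ((value << shift) as i64) >> shift;
    (-16..=15).contains(&signed)
}
```

Under these assumptions, an i64x2 splat of `0x0101_0101_0101_0101` narrows to an 8-bit lane value of `0x01`, which is the kind of constant the TODO above suggests could be emitted as an i8x16 `vmv.v.i`.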

…#6397)

* riscv64: Add `vslidedown.v{x,i}` instructions
* riscv64: Add `v{f,}mv` instructions

  These instructions move values from vectors into other register types and vice-versa.
* riscv64: Add `extractlane` lowerings
* riscv64: Add `vmv.v.*` instructions
* riscv64: Implement `splat`
* riscv64: Add `vmv.v.i` instruction
* riscv64: Remove unused `imm5_zero`
* wasmtime: Enable more RISC-V SIMD tests
* cranelift: Enable ssse3 tests for `fadd-splat` testsuite
* riscv64: Update splat TODO comment

afonso360 added 9 commits May 17, 2023 12:09

riscv64: Add vslidedown.v{x,i} instructions

dafdc6e

riscv64: Add v{f,}mv instructions

53993b4

These instructions move values from vectors into other register types and vice-versa.

riscv64: Add extractlane lowerings

6b74377

riscv64: Add vmv.v.* instructions

d901bba

riscv64: Implement splat

14f5988

riscv64: Add vmv.v.i instruction

be173e2

riscv64: Remove unused imm5_zero

ca455b8

wasmtime: Enable more RISC-V SIMD tests

4067f9f

cranelift: Enable ssse3 tests for fadd-splat testsuite

2091de6

afonso360 requested review from a team as code owners May 17, 2023 12:37

afonso360 requested review from elliottt and removed request for a team May 17, 2023 12:37

github-actions Bot added the cranelift and isle labels May 17, 2023

alexcrichton approved these changes May 17, 2023


riscv64: Update splat TODO comment

d975d0b

afonso360 enabled auto-merge May 17, 2023 18:33

afonso360 added this pull request to the merge queue May 17, 2023

Merged via the queue into bytecodealliance:main with commit 752c7ea May 17, 2023

afonso360 deleted the riscv-extract-splat branch May 17, 2023 19:56

afonso360 merged 10 commits into bytecodealliance:main from afonso360:riscv-extract-splat
