Skip to content

riscv64: Add extractlane and splat instructions#6397

Merged
afonso360 merged 10 commits into
bytecodealliance:mainfrom
afonso360:riscv-extract-splat
May 17, 2023
Merged

riscv64: Add extractlane and splat instructions#6397
afonso360 merged 10 commits into
bytecodealliance:mainfrom
afonso360:riscv-extract-splat

Conversation

@afonso360
Copy link
Copy Markdown
Contributor

👋 Hey,

This PR implements both extractlane and splat.

splat is fairly simple in that we have a bunch of move instructions that by default splat the source X or F register into the vector register. These are vmv.v.x, vfmv.v.f and vmv.v.i, for X, F and immediate sources. The only noteworthy thing about these instructions is that they have weird encodings, and I've added two new instruction formats to deal with this. vmv.v.i has no source operands, only a destination register. And the other ones could maybe be encoded using the existing Imm5 instruction format it was becoming a bit weird keeping all of the variations together.

For extractlane we have two additional move instructions vfmv.f.s and vmv.x.s these move element 0 of the source vector into the appropriate X or F register. Additionally for extracting other elements we use vslidedown that moves all elements of a vector down by n positions and then emit the appropriate move into the destination register.

@afonso360 afonso360 requested review from a team as code owners May 17, 2023 12:37
@afonso360 afonso360 requested review from elliottt and removed request for a team May 17, 2023 12:37
@github-actions github-actions Bot added cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language labels May 17, 2023
@github-actions
Copy link
Copy Markdown

Subscribe to Label Action

cc @cfallin, @fitzgen

Details This issue or pull request has been labeled: "cranelift", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Copy link
Copy Markdown
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Comment on lines +353 to +375
;;;; Multi-Instruction Helpers ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(decl gen_extractlane (Type Reg u8) Reg)

;; When extracting lane 0 for floats, we can use `vfmv.f.s` directly.
(rule 3 (gen_extractlane (ty_vec_fits_in_register ty) src 0)
(if (ty_vector_float ty))
(rv_vfmv_fs src ty))

;; When extracting lane 0 for integers, we can use `vmv.x.s` directly.
(rule 2 (gen_extractlane (ty_vec_fits_in_register ty) src 0)
(if (ty_vector_not_float ty))
(rv_vmv_xs src ty))

;; In the general case, we must first use a `vslidedown` to place the correct lane
;; in index 0, and then use the appropriate `vmv` instruction.
;; If the index fits into a 5-bit immediate, we can emit a `vslidedown.vi`.
(rule 1 (gen_extractlane (ty_vec_fits_in_register ty) src (uimm5_from_u8 idx))
(gen_extractlane ty (rv_vslidedown_vi src idx ty) 0))

;; Otherwise lower it into an X register.
(rule 0 (gen_extractlane (ty_vec_fits_in_register ty) src idx)
(gen_extractlane ty (rv_vslidedown_vx src (imm $I64 idx) ty) 0))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, is there a particular motivation for having this helper here vs inlining it into the lowering of extractlane?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly to avoid having two rules for float and integer in the slide down cases. vslidedown is generic across integer and float elements, but vmv is not, so we recursively call this rule to decide the correct vmv instruction to use.

That being said, we can probably inline the vslidedown rules and have just a generic vmv that decides the correct instruction based on the type.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry to clarify I mean that you've got 4 cases of gen_extractlane here, but why not have those 4 cases be cases on lower (extractlane ..)?

Copy link
Copy Markdown
Contributor Author

@afonso360 afonso360 May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I inline it directly into (lower (extractlane ..)) then it would become, 6 rules, right? Since I would have to duplicate vslidedown.vi+vmv, vslidedown.vi+vfm, vslidedown.vx+vmv and vslidedown.vx+vfm, which are currently just 2 rules.

Or can I then recursively call the lowering rules for extractlane directly?

But that was basically the reasoning, I could avoid 2 rules, this way.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I apologize I missed that crucial bit of this being a recursive rule! In that case yeah definitely makes sense as a standalone decl.

Comment on lines +1050 to +1051
;; TODO: We can splat out more patterns by using for example a vmv.v.i i8x16 for
;; a i64x2 const with a compatible bit pattern.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC the aarch64 backend does some trickery along these lines where it iteratively halves the size of a constant if it's splatted, which may serve as good inspiration for supporting this.

@afonso360 afonso360 enabled auto-merge May 17, 2023 18:33
@afonso360 afonso360 added this pull request to the merge queue May 17, 2023
Merged via the queue into bytecodealliance:main with commit 752c7ea May 17, 2023
@afonso360 afonso360 deleted the riscv-extract-splat branch May 17, 2023 19:56
salewski pushed a commit to salewski/wasmtime that referenced this pull request Jun 30, 2023
…#6397)

* riscv64: Add `vslidedown.v{x,i}` instructions

* riscv64: Add `v{f,}mv` instructions

These instructions move values from vectors into other register types and vice-versa.

* riscv64: Add `extractlane` lowerings

* riscv64: Add `vmv.v.*` instructions

* riscv64: Implement `splat`

* riscv64: Add `vmv.v.i` instruction

* riscv64: Remove unused `imm5_zero`

* wasmtime: Enable more RISC-V SIMD tests

* cranelift: Enable ssse3 tests for `fadd-splat` testsuite

* riscv64: Update splat TODO comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants