x64: Improve codegen for splats #6025

Merged: alexcrichton merged 1 commit into bytecodealliance:main from alexcrichton:splat on Mar 15, 2023

@alexcrichton
Member

This commit goes through the lowerings for the CLIF splat instruction and improves the support for each operator. Many of these lowerings are mirrored from v8/SpiderMonkey, and there are a number of improvements:

  • AVX2 v{p,}broadcast* instructions are added and used when available.
  • Float-based splats are much simpler and are always a single instruction.
  • Integer-based splats no longer insert into an uninitialized xmm value; they instead start with a movd to move the value into an xmm register. This theoretically breaks dependencies on prior instructions, since movd creates a fresh value in the destination register.
  • Loads are now sunk into all of the instructions. A new extractor, sinkable_load_exact, was added to sink the i8/i16 loads.
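
As a rough illustration of the integer-splat bullet above, here is a minimal Rust sketch of a 32-bit splat written with SSE2 intrinsics. It is not code from this PR or from Cranelift: the function name `splat_i32x4` is hypothetical, and pairing `movd` with a `pshufd` shuffle is an assumption about what a typical i32x4 sequence looks like rather than a statement of the exact lowering added here. The point it tries to show is that `movd` writes the whole destination register, so the shuffle that follows does not read any stale register contents.

```rust
// Illustrative only: a hand-written analogue of a `movd` + shuffle splat,
// not the ISLE lowering from this PR.
#[cfg(target_arch = "x86_64")]
pub fn splat_i32x4(x: i32) -> [i32; 4] {
    use std::arch::x86_64::{__m128i, _mm_cvtsi32_si128, _mm_shuffle_epi32, _mm_storeu_si128};

    let mut out = [0i32; 4];
    // SAFETY: SSE2 is part of the x86_64 baseline, and `out` is exactly the
    // 16 bytes that the unaligned store writes.
    unsafe {
        // `movd`: put `x` in lane 0 and zero the other lanes, producing a
        // fresh value with no dependency on the register's old contents.
        let v = _mm_cvtsi32_si128(x);
        // `pshufd` with immediate 0: copy lane 0 into all four lanes.
        let v = _mm_shuffle_epi32::<0>(v);
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, v);
    }
    out
}

// Portable fallback so the sketch also builds on other targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn splat_i32x4(x: i32) -> [i32; 4] {
    [x; 4]
}
```

Calling `splat_i32x4(7)` yields an array with 7 in every lane. With AVX2 available, the same result can come from a single broadcast instruction, which is the case the first bullet covers.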

@github-actions (bot) added the cranelift, cranelift:area:x64, and isle labels on Mar 15, 2023
@github-actions

Subscribe to Label Action

cc @cfallin, @fitzgen

This issue or pull request has been labeled: "cranelift", "cranelift:area:x64", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@fitzgen (Member) left a comment

Nice!

@fitzgen added this pull request to the merge queue on Mar 15, 2023
@github-merge-queue (bot) removed this pull request from the merge queue due to a conflict with the base branch on Mar 15, 2023

This commit goes through the lowerings for the CLIF `splat` instruction
and improves the support for each operator. Many of these lowerings are
mirrored from v8/SpiderMonkey, and there are a number of improvements:

* AVX2 `v{p,}broadcast*` instructions are added and used when available.
* Float-based splats are much simpler and are always a single instruction.
* Integer-based splats no longer insert into an uninitialized `xmm` value;
  they instead start with a `movd` to move the value into an `xmm`
  register. This theoretically breaks dependencies on prior instructions,
  since `movd` creates a fresh value in the destination register.
* Loads are now sunk into all of the instructions. A new extractor,
  `sinkable_load_exact`, was added to sink the i8/i16 loads.