Skip to content

x64: Update codegen of XmmCmove pseudo-inst#10839

Merged
cfallin merged 1 commit into
bytecodealliance:mainfrom
alexcrichton:x64-new-xmm-cmove
May 27, 2025
Merged

x64: Update codegen of XmmCmove pseudo-inst#10839
cfallin merged 1 commit into
bytecodealliance:mainfrom
alexcrichton:x64-new-xmm-cmove

Conversation

@alexcrichton
Copy link
Copy Markdown
Member

In #4317 this instruction was updated to handle 128-bit vectors in addition to the previous handling of 32/64-bit floats. Originally the pseudo-instruction used movs{s,d} to achieve its task and when adding 128-bit support I mistakenly switched both f32/f64 paths to using movsd instead of conditionally using movss for f32. In retrospect though it's probably best to use a full register move here instead of just a singular mov because movss and movsd preserve the upper bits of the register, needlessly creating a data dependency with the previous value in the register.

This commit updates this helper to using Inst::gen_move which already internally does this optimization of using movaps, a documented zero-latency instruction, for all xmm-style register movements.

In bytecodealliance#4317 this instruction was updated to handle 128-bit vectors in
addition to the previous handling of 32/64-bit floats. Originally the
pseudo-instruction used `movs{s,d}` to achieve its task and when adding
128-bit support I mistakenly switched both f32/f64 paths to using
`movsd` instead of conditionally using `movss` for `f32`. In retrospect
though it's probably best to use a full register move here instead of
just a singular mov because `movss` and `movsd` preserve the upper bits
of the register, needlessly creating a data dependency with the previous
value in the register.

This commit updates this helper to using `Inst::gen_move` which already
internally does this optimization of using `movaps`, a documented
zero-latency instruction, for all xmm-style register movements.
@alexcrichton alexcrichton requested review from a team as code owners May 27, 2025 15:48
@alexcrichton alexcrichton requested review from cfallin and dicej and removed request for a team May 27, 2025 15:48
@cfallin cfallin added this pull request to the merge queue May 27, 2025
Merged via the queue into bytecodealliance:main with commit 55c6e16 May 27, 2025
53 checks passed
@alexcrichton alexcrichton deleted the x64-new-xmm-cmove branch May 27, 2025 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants