Fix some i128 shift-related bugs in x64 backend. #2682
This fixes bytecodealliance#2672 and bytecodealliance#2679, and also fixes an incorrect instruction emission (`test` with small immediate) that we had missed earlier.

The shift-related fixes have to do with (i) shifts by 0 bits, a special case that must be handled explicitly; and (ii) shifts whose amount is itself a 128-bit value, which we can handle by just dropping the upper half of the amount (we only use 3--7 bits of it).

This adjusts the lowerings appropriately, and also adds run-tests to ensure that the lowerings actually execute correctly. Previously we only had compile-tests with golden lowerings; I'd like to correct this for more ops eventually, adding run-tests beyond what the Wasm spec and frontend cover.
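To make the two shift cases concrete, here is a minimal Rust sketch of a 128-bit left shift built from two 64-bit halves. It is a model of the semantics only, not the instruction sequence emitted by the backend; the function name `shl_i128` and the `(lo, hi)` tuple layout are assumptions made for illustration.

```rust
/// Shift a 128-bit value, given as (lo, hi) 64-bit halves, left by `amt`.
/// `amt` is itself a 128-bit value; only its low 7 bits are used.
/// Hypothetical helper for illustration, not code from the PR.
fn shl_i128(lo: u64, hi: u64, amt: u128) -> (u64, u64) {
    // Drop the upper half of the amount, then keep 7 bits (0..=127).
    let amt = (amt as u64 & 127) as u32;
    if amt == 0 {
        // Special case: the carry term below would need `lo >> 64`,
        // which is not a valid 64-bit shift, so return the input as-is.
        (lo, hi)
    } else if amt < 64 {
        // Bits shifted out of the low half carry into the high half.
        (lo << amt, (hi << amt) | (lo >> (64 - amt)))
    } else {
        // The low half moves entirely into the high half.
        (0, lo << (amt - 64))
    }
}
```

For example, `shl_i128(1, 0, 0)` yields `(1, 0)`, and an amount of `(1u128 << 64) | 3` behaves like a shift by 3 because the upper half of the amount is discarded.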
It didn't help, unfortunately.
Add a bunch of test vectors that actually expose this (previously the shift-by-zero test had equal lower and upper halves and hid the bug), including the most basic of all, 1 << 0 == 1 (thanks @bjorn3 for finding this).
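The sketch below shows one way a missing shift-by-zero case can hide behind a test input with equal halves. The failure mode is an assumption, not taken from the PR diff: it supposes the carry from the low half is computed as `lo >> (64 - amt)` and that the shift count is masked to 6 bits, as the x64 `shr` instruction does for 64-bit operands. `shl_i128_buggy` is a hypothetical name.

```rust
/// Hypothetical buggy lowering with no special case for `amt == 0`.
/// `wrapping_shr` masks its count to 6 bits, like x64 `shr`, so the
/// carry shift of `64 - 0 = 64` silently becomes a shift by 0.
/// The sketch covers shift amounts 0..=63 only.
fn shl_i128_buggy(lo: u64, hi: u64, amt: u32) -> (u64, u64) {
    let amt = amt & 63;
    // For amt == 0 this is `lo >> 0 == lo`, not the intended 0.
    let carry = lo.wrapping_shr(64 - amt);
    (lo << amt, (hi << amt) | carry)
}
```

With `lo == hi`, a shift by zero returns `(lo, hi | lo)`, which equals the input, so the test passes by accident. With `1 << 0` (`lo = 1`, `hi = 0`), the stray carry lands in the upper half and the result is `(1, 1)` instead of `(1, 0)`.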
Ah, yes, you were right; I confused |
bjorn3 left a comment:
Thanks! This version works.
abrown left a comment:
I think this looks fine. (As an aside, why aren't we using SSE2's PSLLDQ/PSRLDQ instructions instead of these long sequences? I haven't looked at much of the i128 code but it would seem that moving upper and lower halves to XMMs and back might still be faster for one of these cases?)
Ah, the simple answer is that I don't know SSE well enough to reach for such instructions -- though it looks like they should work much more efficiently than these sequences! I'll go ahead and merge with your +1 for now so that we have correct results; but we can definitely improve this later. Thanks!