Skip to content

x64: avoid load-coalescing SIMD operations with non-aligned loads#3107

Merged
abrown merged 3 commits into
bytecodealliance:mainfrom
abrown:fix-2943
Jul 26, 2021
Merged

x64: avoid load-coalescing SIMD operations with non-aligned loads#3107
abrown merged 3 commits into
bytecodealliance:mainfrom
abrown:fix-2943

Conversation

@abrown
Copy link
Copy Markdown
Member

@abrown abrown commented Jul 21, 2021

Fixes #2943, though not as optimally as may be desired. With x64 SIMD
instructions, the memory operand must be aligned--this change adds that
check. There are cases, however, where we can do better--see #3106.

Fixes bytecodealliance#2943, though not as optimally as may be desired. With x64 SIMD
instructions, the memory operand must be aligned--this change adds that
check. There are cases, however, where we can do better--see bytecodealliance#3106.
@alexcrichton
Copy link
Copy Markdown
Member

One thing I noticed reading your thoughts on #2943 is that as an engine we have to assume that all loads/stores in wasm are unaligned, even if the alignment specified on the memory operation is aligned. The alignment in the memarg is just a hint and the wasm itself doesn't have to uphold the alignment. I suspect, though, that all wasm loads/stores are flagged as unaligned, so this should still fix the issue? (can tests be added?)

I presume that cranelift could still otherwise try to prove that an address is actually aligned, but I would be surprised if that were a cheap or already-implemented analysis...

@github-actions github-actions Bot added cranelift Issues related to the Cranelift code generator cranelift:area:x64 Issues related to x64 codegen labels Jul 21, 2021
@abrown
Copy link
Copy Markdown
Member Author

abrown commented Jul 21, 2021

I'll probably leave this open until @cfallin takes a look: I think this type of change has to happen but maybe he can think of a better way.

Copy link
Copy Markdown
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for this fix!

I agree that we can do better if we have a hint (that comes from the Wasm instruction's alignment hint), separately from the lack of semantically-meaningful actual alignment constraint. I think the right course might be:

  • Add another flag, something like AlignHint, that means "likely to be aligned to natural boundary, but still valid if not"
  • In the Wasm translator, set this if the alignment hint is equal to (or a multiple of) the load size
  • Allow for load-op merging, but generate a special sequence during lowering (with internal control flow) that provides a "trap recovery point" with the unmerged ops
  • Install a SIGBUS handler that redirects execution to the fallback on alignment trap.

The fallback path should ideally be out-of-lined to the bottom of the function (since it should be cold), but that would require some more lowering logic ("set aside this other sequence and emit it at the end").

It also imposes a bit on the runtime, which may (?) have implications for other embedders -- @alexcrichton do you have any thoughts on this?

In the meantime hopefully the perf hit of conservatively not coalescing is not too bad!

@alexcrichton
Copy link
Copy Markdown
Member

Dealing with a signal and handling it I don't think should be too too hard on embedders, there's nothing really different than trap handling I think. That being said I suspect it would be significantly tricky, so I think we'd probably want some motivating data first to see if the optimization is worth it.

@bjorn3
Copy link
Copy Markdown
Contributor

bjorn3 commented Jul 26, 2021

In case of for example cg_clif there is no embedder that can catch the SIGBUS. Now in case of cg_clif most loads/stores are guaranteed to be aligned, so AlignHint is not very useful, but I can imagine that there will be other cases where it may be useful. I think at the very least cranelift-wasm should have an option to disable AlignHint emission.

@abrown abrown merged commit 766774e into bytecodealliance:main Jul 26, 2021
@abrown abrown deleted the fix-2943 branch July 26, 2021 20:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

wasm simd: folded xor and unaligned load incorrectly "traps"

4 participants