Skip to content

cranelift: Optimize select_spectre_guard, carefully#8139

Merged
jameysharp merged 2 commits into
bytecodealliance:mainfrom
jameysharp:optimize-spectre
Mar 20, 2024
Merged

cranelift: Optimize select_spectre_guard, carefully#8139
jameysharp merged 2 commits into
bytecodealliance:mainfrom
jameysharp:optimize-spectre

Conversation

@jameysharp
Copy link
Copy Markdown
Contributor

This commit makes two changes to our treatment of
select_spectre_guard.

First, stop annotating this instruction as having any side effects. We only care that if its value result is used, then it's computed without branching on the condition input. We don't otherwise care when the value is computed, or if it's computed at all.

Second, introduce some carefully selected ISLE egraph rewrites for this instruction. These particular rewrites are those where we can statically determine which SSA value will be the result of the instruction. Since there is no actual choice involved, there's no way to accidentally introduce a branch on the condition input.

@jameysharp jameysharp requested a review from a team as a code owner March 14, 2024 20:45
@jameysharp jameysharp requested review from fitzgen and removed request for a team March 14, 2024 20:45
@jameysharp
Copy link
Copy Markdown
Contributor Author

Perhaps @cfallin should review this? I was talking about it with him yesterday.

@cfallin cfallin self-requested a review March 14, 2024 21:13
@cfallin
Copy link
Copy Markdown
Member

cfallin commented Mar 14, 2024

Sure, happy to take a look.

Copy link
Copy Markdown
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rewrite looks good to me; thanks for thinking through and discussing this yesterday!

A few things I think we'll want:

  • Could we have an egraph filetest showing each case?
  • I think it would be good to document (or update documentation about) the semantics of select_spectre_guard and the corresponding justification, a little more than the comments here. In particular:
    • The instruction definition's description still says the op is guaranteed not to be removed. That's not true anymore (in fact that's the point of this PR!); can we update to say something about not picking the wrong input even under a mis-speculated branch (as you do in the comment in the ISLE rewrites)?
    • Let's note in the ISLE rewrites that removing select_spectre_guard, if semantically correct for the functional result, always meets this speculation condition: by removing an op we are only ever going to remove computation that could be speculated, we're never adding new speculation.
    • Finally, can we add a comment in around the instruction description noting that the op is pure because our reasoning about speculation safety tells us it's always safe to remove if functionally correct, and safe to merge with another functionally identical op.

Thanks!

@github-actions github-actions Bot added cranelift Issues related to the Cranelift code generator cranelift:meta Everything related to the meta-language. isle Related to the ISLE domain-specific language labels Mar 14, 2024
@github-actions
Copy link
Copy Markdown

Subscribe to Label Action

cc @cfallin, @fitzgen

Details This issue or pull request has been labeled: "cranelift", "cranelift:meta", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

jameysharp added a commit to jameysharp/wasmtime that referenced this pull request Mar 18, 2024
Currently, every access to a table element does a bounds-check with a
conditional branch to a block that explicitly traps.

Instead, when SPECTRE mitigations are enabled, let's change the address
computation to return a null pointer for out-of-bounds accesses, and
then allow the subsequent load or store to trap.

This is less code in that case since we can reuse instructions we needed
anyway.

In addition, when the table has constant size and the element index is a
constant and mid-end optimization is enabled, this allows the
bounds-check to be constant folded away. Later, bytecodealliance#8139 will let us
optimize away the select_spectre_guard instruction in this case too.

Once we also implement bytecodealliance#8160, `tests/disas/typed-funcrefs.wat` should be
almost as fast as native indirect function calls.
jameysharp added a commit to jameysharp/wasmtime that referenced this pull request Mar 19, 2024
Currently, every access to a table element does a bounds-check with a
conditional branch to a block that explicitly traps.

Instead, when Spectre mitigations are enabled, let's change the address
computation to return a null pointer for out-of-bounds accesses, and
then allow the subsequent load or store to trap.

This is less code in that case since we can reuse instructions we needed
anyway.

We return the MemFlags that the memory access should use, in addition to
the address it should access. That way we don't record trap metadata on
memory access instructions which can't actually trap due to being
preceded by a `trapnz`-based bounds check, when Spectre mitigations are
disabled.

In addition, when the table has constant size and the element index is a
constant and mid-end optimization is enabled, this allows the
bounds-check to be constant folded away. Later, bytecodealliance#8139 will let us
optimize away the select_spectre_guard instruction in this case too.

Once we also implement bytecodealliance#8160, `tests/disas/typed-funcrefs.wat` should be
almost as fast as native indirect function calls.
github-merge-queue Bot pushed a commit that referenced this pull request Mar 19, 2024
Currently, every access to a table element does a bounds-check with a
conditional branch to a block that explicitly traps.

Instead, when Spectre mitigations are enabled, let's change the address
computation to return a null pointer for out-of-bounds accesses, and
then allow the subsequent load or store to trap.

This is less code in that case since we can reuse instructions we needed
anyway.

We return the MemFlags that the memory access should use, in addition to
the address it should access. That way we don't record trap metadata on
memory access instructions which can't actually trap due to being
preceded by a `trapnz`-based bounds check, when Spectre mitigations are
disabled.

In addition, when the table has constant size and the element index is a
constant and mid-end optimization is enabled, this allows the
bounds-check to be constant folded away. Later, #8139 will let us
optimize away the select_spectre_guard instruction in this case too.

Once we also implement #8160, `tests/disas/typed-funcrefs.wat` should be
almost as fast as native indirect function calls.
This commit makes two changes to our treatment of
`select_spectre_guard`.

First, stop annotating this instruction as having any side effects. We
only care that if its value result is used, then it's computed without
branching on the condition input. We don't otherwise care when the value
is computed, or if it's computed at all.

Second, introduce some carefully selected ISLE egraph rewrites for this
instruction. These particular rewrites are those where we can statically
determine which SSA value will be the result of the instruction. Since
there is no actual choice involved, there's no way to accidentally
introduce speculation on the condition input.
@jameysharp jameysharp requested a review from a team as a code owner March 19, 2024 05:56
@jameysharp
Copy link
Copy Markdown
Contributor Author

I think I've addressed all your comments about documentation, Chris, and I'd appreciate feedback on that aspect. I still need to write filetests exercising these new rules, but at least now we can see that this has a positive effect on the typed-funcref tests, as expected.

Copy link
Copy Markdown
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New doc comments look great -- thanks!

@jameysharp jameysharp enabled auto-merge March 20, 2024 22:16
@jameysharp jameysharp added this pull request to the merge queue Mar 20, 2024
Merged via the queue into bytecodealliance:main with commit f59b324 Mar 20, 2024
@jameysharp jameysharp deleted the optimize-spectre branch March 20, 2024 22:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:meta Everything related to the meta-language. cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants