Flush Icache on AArch64 Windows #4997
Merged: cfallin merged 28 commits into bytecodealliance:main from afonso360:windows-aarch64-icache on Oct 12, 2022
Commits
7e05702 cranelift: Add FlushInstructionCache for AArch64 on Windows (afonso360)
d4741f8 wasmtime: Add FlushInstructionCache for AArch64 on Windows (afonso360)
aaaa153 cranelift: Add MemoryUse flag to JIT Memory Manager (afonso360)
3d771e2 Add jit-icache-coherence crate (afonso360)
db720da cranelift: Use `jit-icache-coherence` (afonso360)
be885d9 wasmtime: Use `jit-icache-coherence` (afonso360)
7f9b719 jit-icache-coherence: Make rustix feature additive (afonso360)
c6f68f8 wasmtime: Remove rustix from wasmtime-jit (afonso360)
e377383 Rename wasmtime-jit-icache-coherency crate (afonso360)
5cfc63d Use cfg-if in wasmtime-jit-icache-coherency crate (afonso360)
ec7a11c Use inline instead of inline(always) (afonso360)
55f08c0 Add unsafe marker to clear_cache (afonso360)
b3a1332 Conditionally compile all rustix operations (afonso360)
16b456f Publish `wasmtime-jit-icache-coherence` (afonso360)
0c85621 Remove explicit windows check (afonso360)
349f7fa cranelift: Remove len != 0 check (afonso360)
0eef7b0 Comment cleanups (afonso360)
21165d8 Make clear_cache safe (afonso360)
e8cf00e Rename pipeline_flush to pipeline_flush_mt (afonso360)
a7375b8 Revert "Make clear_cache safe" (afonso360)
b4a45d2 More docs! (afonso360)
04ff662 Fix pipeline_flush reference on clear_cache (afonso360)
f27a014 Update more docs! (afonso360)
f302291 Move pipeline flush after `mprotect` calls (afonso360)
125d63c wasmtime: Remove rustix backend from icache crate (afonso360)
1a39079 wasmtime: Use libc for macos (afonso360)
4e9a52a wasmtime: Flush icache on all arch's for windows (afonso360)
79001aa wasmtime: Add flags to membarrier call (afonso360)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| [package] | ||
| name = "wasmtime-jit-icache-coherence" | ||
| version = "2.0.0" | ||
| authors.workspace = true | ||
| description = "Utilities for JIT icache maintenance" | ||
| documentation = "https://docs.rs/jit-icache-coherence" | ||
| license = "Apache-2.0 WITH LLVM-exception" | ||
| repository = "https://github.com/bytecodealliance/wasmtime" | ||
| edition.workspace = true | ||
| | ||
| [dependencies] | ||
| cfg-if = "1.0" | ||
| | ||
| [target.'cfg(target_os = "windows")'.dependencies.windows-sys] | ||
| workspace = true | ||
| features = [ | ||
| "Win32_Foundation", | ||
| "Win32_System_Threading", | ||
| "Win32_System_Diagnostics_Debug", | ||
| ] | ||
| | ||
| [target.'cfg(any(target_os = "linux", target_os = "macos"))'.dependencies.libc] | ||
| version = "0.2.42" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,105 @@ | ||
| //! This crate provides utilities for instruction cache maintenance for JIT authors. | ||
| //! | ||
| //! In self-modifying code, such as when writing a JIT, special care must be taken when marking the | ||
| //! code as ready for execution. On fully coherent architectures (X86, S390X) the data cache (D-Cache) | ||
| //! and the instruction cache (I-Cache) are always in sync. However, this is not guaranteed for all | ||
| //! architectures; on AArch64, for example, these caches are not coherent with each other. | ||
| //! | ||
| //! When writing new code there may be an I-cache entry for that same address, which causes the | ||
| //! processor to execute whatever was in the cache instead of the new code. | ||
| //! | ||
| //! See the [ARM Community - Caches and Self-Modifying Code] blog post that contains a great | ||
| //! explanation of the above. (It references AArch32 but it has a high level overview of this problem). | ||
| //! | ||
| //! ## Usage | ||
| //! | ||
| //! You should call [clear_cache] on any pages that you write with the new code that you're intending | ||
| //! to execute. You can do this at any point in the code from the moment that you write the page up to | ||
| //! the moment where the code is executed. | ||
| //! | ||
| //! You also need to call [pipeline_flush_mt] to ensure that there isn't any invalid instruction currently | ||
| //! in the pipeline if you are running in a multi threaded environment. | ||
| //! | ||
| //! For single threaded programs you are free to omit [pipeline_flush_mt], otherwise you need to | ||
| //! call both [clear_cache] and [pipeline_flush_mt] in that order. | ||
| //! | ||
| //! ### Example: | ||
| //! ``` | ||
| //! # use std::ffi::c_void; | ||
| //! # use std::io; | ||
| //! # use wasmtime_jit_icache_coherence::*; | ||
| //! # | ||
| //! # struct Page { | ||
| //! # addr: *const c_void, | ||
| //! # len: usize, | ||
| //! # } | ||
| //! # | ||
| //! # fn main() -> io::Result<()> { | ||
| //! # | ||
| //! # let run_code = || {}; | ||
| //! # let code = vec![0u8; 64]; | ||
| //! # let newly_written_pages = vec![Page { | ||
| //! # addr: &code[0] as *const u8 as *const c_void, | ||
| //! # len: code.len(), | ||
| //! # }]; | ||
| //! # unsafe { | ||
| //! // Invalidate the cache for all the newly written pages where we wrote our new code. | ||
| //! for page in newly_written_pages { | ||
| //! clear_cache(page.addr, page.len)?; | ||
| //! } | ||
| //! | ||
| //! // Once those are invalidated we also need to flush the pipeline | ||
| //! pipeline_flush_mt()?; | ||
| //! | ||
| //! // We can now safely execute our new code. | ||
| //! run_code(); | ||
| //! # } | ||
| //! # Ok(()) | ||
| //! # } | ||
| //! ``` | ||
| //! | ||
| //! <div class="example-wrap" style="display:inline-block"><pre class="compile_fail" style="white-space:normal;font:inherit;"> | ||
| //! | ||
| //! **Warning**: In order to correctly use this interface you should always call [clear_cache]. | ||
| //! A followup call to [pipeline_flush_mt] is required if you are running in a multi-threaded environment. | ||
| //! | ||
| //! </pre></div> | ||
| //! | ||
| //! [ARM Community - Caches and Self-Modifying Code]: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/caches-and-self-modifying-code | ||
| | ||
| use std::ffi::c_void; | ||
| use std::io::Result; | ||
| | ||
| cfg_if::cfg_if! { | ||
| if #[cfg(target_os = "windows")] { | ||
| mod win; | ||
| use win as imp; | ||
| } else { | ||
| mod libc; | ||
| use crate::libc as imp; | ||
| } | ||
| } | ||
| | ||
| /// Flushes instructions in the processor pipeline | ||
| /// | ||
| /// This pipeline flush is broadcast to all processors that are executing threads in the current process. | ||
| /// | ||
| /// Calling [pipeline_flush_mt] is only required for multi-threaded programs and it *must* be called | ||
| /// after all calls to [clear_cache]. | ||
| /// | ||
| /// If the architecture does not require a pipeline flush, this function does nothing. | ||
| pub fn pipeline_flush_mt() -> Result<()> { | ||
| imp::pipeline_flush_mt() | ||
| } | ||
| | ||
| /// Flushes the instruction cache for a region of memory. | ||
| /// | ||
| /// If the architecture does not require an instruction cache flush, this function does nothing. | ||
| /// | ||
| /// # Unsafe | ||
| /// | ||
| /// It is necessary to call [pipeline_flush_mt] after this function if you are running in a multi-threaded | ||
| /// environment. | ||
| pub unsafe fn clear_cache(ptr: *const c_void, len: usize) -> Result<()> { | ||
| imp::clear_cache(ptr, len) | ||
| } | ||
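The ordering rule in the docs above (every `clear_cache` call first, then a single `pipeline_flush_mt`) could be wrapped in a small helper. The following is only a sketch: `publish_code` is a hypothetical name that does not appear in this PR, and the two operations are passed in as closures so the example runs without the crate. With the real crate one would presumably pass `|p, l| unsafe { clear_cache(p, l) }` and `pipeline_flush_mt`.

```rust
use std::cell::RefCell;
use std::ffi::c_void;
use std::io::Result;

/// Hypothetical helper (not part of the PR): invalidates every region first,
/// then issues exactly one pipeline flush, mirroring the documented order.
fn publish_code<C, F>(regions: &[(*const c_void, usize)], mut clear: C, flush: F) -> Result<()>
where
    C: FnMut(*const c_void, usize) -> Result<()>,
    F: FnOnce() -> Result<()>,
{
    for &(ptr, len) in regions {
        // Stop at the first failure; no flush is issued in that case.
        clear(ptr, len)?;
    }
    flush()
}

fn main() -> Result<()> {
    // Stand-in for freshly written JIT code.
    let code = vec![0u8; 64];
    let regions = [(code.as_ptr() as *const c_void, code.len())];

    // The closures below only record the call order; they do not touch any cache.
    let log = RefCell::new(Vec::new());
    publish_code(
        &regions,
        |_ptr, len| {
            log.borrow_mut().push(format!("clear_cache({len})"));
            Ok(())
        },
        || {
            log.borrow_mut().push("pipeline_flush_mt".to_string());
            Ok(())
        },
    )?;

    assert_eq!(log.into_inner(), ["clear_cache(64)", "pipeline_flush_mt"]);
    Ok(())
}
```

Taking the flush as `FnOnce` makes it harder to interleave flushes between regions by accident, which matches the "after all calls to `clear_cache`" requirement.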
Not strictly necessary for this PR, but I wonder how RISC-V fits into this -- it looks like at the ISA level it has a `fence.i` instruction, so it is closer to AArch64 in this regard (weaker coherence by default). Is it enough to do the same `membarrier` calls as on `aarch64`? (cc @yuyang-ok)
In the absence of any other information, perhaps we could perform the same `membarrier` calls on RISC-V as we do on aarch64?

I agree that we do need to do something; from what I've read, RISC-V is allowed to have incoherent I and D caches. From this documentation of the kernel, it looks like CORE_SYNC is not yet implemented for RISC-V. I'm not sure they support GLOBAL either.
I've tried to read the kernel a bit, and from what I understand they have a custom syscall that does sort of what we want? But it looks like it does not guarantee anything regarding pipelines.
Edit: That syscall ends up doing something very similar to AArch64 where they execute a `fence.i` on all cores. (link)

This is an architectural detail - I am not familiar with RISC-V at all, but it is possible that the architecture specifies that if instruction caches are flushed, then the pipeline might be flushed as well if necessary, hence no need to do anything in addition; on AArch64 these actions are decoupled. Or to put it another way - an architecture having incoherent data and instruction caches does not imply that it behaves in exactly the same way as the 64-bit Arm architecture (and hence requiring exactly the same sequence of actions); possibly there are nuances.
BTW the system call you have linked to says that it can be made to apply to all threads in the process, not just the caller, which might be what you are looking for.
Yeah, that's right, we should go and double check that!
I've opened #5033 to track this, but I'm going to look at the ISA manual to check if they guarantee anything like that.
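
For reference while that question is open, here is a rough sketch of what a local-only synchronization could look like in Rust. It is not code from this PR: `local_icache_sync` is a hypothetical name, and the sketch assumes a `riscv64` target that provides the `fence.i` instruction (for example `riscv64gc`). `fence.i` only affects the hart that executes it, which is why the thread above talks about running it on all cores; this sketch does not attempt that, nor does it settle the pipeline question.

```rust
/// Hypothetical helper (not part of the PR). On riscv64 it executes `fence.i`
/// on the current hart and returns true; elsewhere it does nothing and returns false.
#[cfg(target_arch = "riscv64")]
#[inline]
fn local_icache_sync() -> bool {
    // Assumption: the target supports `fence.i`. Other harts are NOT synchronized.
    unsafe { core::arch::asm!("fence.i", options(nostack)) };
    true
}

/// Fallback for every other architecture: no instruction is issued.
#[cfg(not(target_arch = "riscv64"))]
#[inline]
fn local_icache_sync() -> bool {
    false
}

fn main() {
    // Reports whether a fence was actually executed on this build target.
    println!("fence.i executed: {}", local_icache_sync());
}
```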