Add CHERI-safe memmove implementation #823
Open
devnexen wants to merge 5 commits into microsoft:main from
Conversation
New attempt at a custom memmove, succeeding PR microsoft#593, which was reverted due to fuzzer-found bugs (an off-by-one in the reverse copy loop with size_t underflow) and CHERI incompatibility (the byte-at-a-time reverse copy destroys capability tags).

Key changes from the original PR microsoft#593:

- Per-Arch move() and forward_move() methods following the existing copy() pattern, instead of a single generic byte-by-byte reverse.
- Three-way overlap detection: non-overlapping uses the optimized Arch::copy(), dst > src overlap uses Arch::move() (reverse), and dst < src overlap uses Arch::forward_move() (forward, without the copy_end trick that re-reads already-overwritten bytes).
- block_reverse_copy<Size> operates at register width (16 bytes on x86-64/PPC64/CHERI) instead of byte-by-byte, with byte-by-byte copies only for the sub-register remainder.
- GenericStrictProvenance (CHERI) move() and forward_move() preserve capability tags by using pointer-pair (Ptr2) operations on aligned regions, with byte-by-byte copies only for the sub-pointer head/tail padding where no aligned capabilities can exist.
- Comprehensive tests: overlapping copies at various sizes (1-2048) and overlap amounts, exhaustive offset x length testing for small buffers (2-64), alignment boundary tests, bounds checking, and direct snmalloc::memmove<false> unchecked-path tests.
- Re-enables the memmove fuzz tests (simple_memmove, forward_memmove, backward_memmove) in snmalloc-fuzzer.cpp.

Co-authored-by: Claude <noreply@anthropic.com>
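The three-way dispatch and the register-width reverse copy described above can be sketched roughly as follows. This is a simplified standalone sketch, not snmalloc's actual per-Arch API: the `sketch` namespace, `BlockSize`, and function names are illustrative, and plain `std::memmove`/`std::memcpy` calls stand in for the per-Arch block primitives.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace sketch
{
  // Stand-in for the register width (16 bytes on x86-64/PPC64).
  constexpr size_t BlockSize = 16;

  // Reverse copy for dst > src overlap: one register-width block at a
  // time from the end, then byte-by-byte for the sub-block remainder.
  inline void reverse_move(void* dst, const void* src, size_t len)
  {
    auto* d = static_cast<unsigned char*>(dst);
    auto* s = static_cast<const unsigned char*>(src);
    while (len >= BlockSize)
    {
      // Decrement before indexing, so len never underflows below zero.
      len -= BlockSize;
      std::memmove(d + len, s + len, BlockSize);
    }
    while (len > 0)
    {
      len--;
      d[len] = s[len];
    }
  }

  // Forward copy for dst < src overlap: plain low-to-high byte copy,
  // with no end-trick that would re-read already-overwritten bytes.
  inline void forward_move(void* dst, const void* src, size_t len)
  {
    auto* d = static_cast<unsigned char*>(dst);
    auto* s = static_cast<const unsigned char*>(src);
    for (size_t i = 0; i < len; i++)
      d[i] = s[i];
  }

  // Three-way overlap dispatch.
  inline void* my_memmove(void* dst, const void* src, size_t len)
  {
    auto d = reinterpret_cast<uintptr_t>(dst);
    auto s = reinterpret_cast<uintptr_t>(src);
    if (d == s || len == 0)
      return dst;
    if (d > s && d < s + len)
      reverse_move(dst, src, len);   // dst overlaps the tail of src
    else if (s > d && s < d + len)
      forward_move(dst, src, len);   // src overlaps the tail of dst
    else
      std::memcpy(dst, src, len);    // disjoint: fast copy path
    return dst;
  }
}
```

The high-to-low block order is what makes the reverse path safe: every source byte at an offset at or above the current position has already been copied before its destination slot is overwritten, and `std::memmove` handles any overlap inside a single block.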
Collaborator
Author
So to summarize, Claude filled the gap with regard to CHERI here.
Add copy_one_move<Size> that always uses a struct copy instead of __builtin_memcpy_inline, which ASan treats as memcpy and flags overlapping src/dst as an error. Use it in block_reverse_copy and in a new block_copy_move for all forward_move/move overlap paths.
Move all memmove tests into a new src/test/func/memmove/ directory so they build as func-memmove-fast and func-memmove-check, separate from func-memcpy-fast and func-memcpy-check. This makes it clear which test is failing when CI reports errors.
The struct copy (*d = *s) in copy_one_move was being lowered by the compiler into a memcpy call, which ASan then flagged as memcpy-param-overlap on memmove's overlapping buffers. Switch to __builtin_memmove, which handles overlap correctly and still optimizes to register-width loads/stores. Add a byte-by-byte fallback for compilers lacking __builtin_memmove.
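A rough standalone sketch of a helper along the lines this commit message describes (the `SKETCH_HAS_BUILTIN_MEMMOVE` detection macro and the exact fallback shape are illustrative assumptions, not snmalloc's code):

```cpp
#include <cstddef>

// Detect __builtin_memmove. The nested #ifdef keeps preprocessors that
// lack __has_builtin from choking on the feature test itself.
#ifdef __has_builtin
#  if __has_builtin(__builtin_memmove)
#    define SKETCH_HAS_BUILTIN_MEMMOVE 1
#  endif
#endif

// Copy exactly Size bytes between possibly-overlapping buffers.
// __builtin_memmove is overlap-safe (unlike __builtin_memcpy_inline,
// which ASan treats as memcpy) and, for small constant Size, still
// lowers to register-width loads and stores.
template<size_t Size>
inline void copy_one_move(void* dst, const void* src)
{
#ifdef SKETCH_HAS_BUILTIN_MEMMOVE
  __builtin_memmove(dst, src, Size);
#else
  // Byte-by-byte fallback: stage through a temporary buffer so
  // overlapping src/dst cannot corrupt the result.
  unsigned char tmp[Size];
  const unsigned char* s = static_cast<const unsigned char*>(src);
  unsigned char* d = static_cast<unsigned char*>(dst);
  for (size_t i = 0; i < Size; i++)
    tmp[i] = s[i];
  for (size_t i = 0; i < Size; i++)
    d[i] = tmp[i];
#endif
}
```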