
Add CHERI-safe memmove implementation#823

Open
devnexen wants to merge 5 commits into microsoft:main from devnexen:memmove_try2

Conversation

Collaborator

@devnexen devnexen commented Mar 3, 2026

New attempt at a custom memmove, following up on PR #593, which was reverted due to fuzzer-found bugs (an off-by-one in the reverse copy loop with size_t underflow) and CHERI incompatibility (byte-at-a-time reverse copy destroys capability tags).

Key changes from the original PR #593:

  • Per-Arch move() and forward_move() methods following the existing copy() pattern, instead of a single generic byte-by-byte reverse.

  • Three-way overlap detection: non-overlapping uses optimized Arch::copy(), dst > src overlap uses Arch::move() (reverse), and dst < src overlap uses Arch::forward_move() (forward without the copy_end trick that re-reads already-overwritten bytes).

  • block_reverse_copy<Size> operates at register width (16 bytes on x86-64/PPC64/CHERI) instead of byte-by-byte, with byte-by-byte only for the sub-register remainder.

  • GenericStrictProvenance (CHERI) move() and forward_move() preserve capability tags by using pointer-pair (Ptr2) operations on aligned regions, with byte-by-byte only for sub-pointer head/tail padding where no aligned capabilities can exist.

  • Comprehensive tests: overlapping copies at various sizes (1-2048) and overlap amounts, exhaustive offset × length testing for small buffers (2-64), alignment boundary tests, bounds checking, and direct snmalloc::memmove<false> unchecked path tests.

  • Re-enables memmove fuzz tests (simple_memmove, forward_memmove, backward_memmove) in snmalloc-fuzzer.cpp.

Co-authored-by: Claude <noreply@anthropic.com>
@devnexen devnexen marked this pull request as draft March 3, 2026 22:54
Collaborator Author

devnexen commented Mar 3, 2026

So to summarize, Claude filled the gap with regard to CHERI here.

devnexen added 4 commits March 3, 2026 23:12

Add copy_one_move<Size> that always uses struct-copy instead of
__builtin_memcpy_inline, which ASan treats as memcpy and flags
overlapping src/dst as an error. Use it in block_reverse_copy and
a new block_copy_move for all forward_move/move overlap paths.

Move all memmove tests into a new src/test/func/memmove/ directory
so they build as func-memmove-fast and func-memmove-check, separate
from func-memcpy-fast and func-memcpy-check. This makes it clear
which test is failing when CI reports errors.

The struct copy (*d = *s) in copy_one_move was being lowered by the
compiler into a memcpy call, which ASan then flagged as
memcpy-param-overlap for memmove's overlapping buffers. Switch to
__builtin_memmove, which correctly handles overlap and still optimizes
to register-width loads/stores. Add a byte-by-byte fallback for
compilers lacking __builtin_memmove.
@devnexen devnexen marked this pull request as ready for review March 4, 2026 05:08
