Skip to content

aarch64: Fix Ldr19 relocations being unresolvable #6384

Merged
alexcrichton merged 10 commits into
bytecodealliance:mainfrom
alexcrichton:fix-ldr19-out-of-bounds
May 16, 2023
Merged

aarch64: Fix Ldr19 relocations being unresolvable #6384
alexcrichton merged 10 commits into
bytecodealliance:mainfrom
alexcrichton:fix-ldr19-out-of-bounds

Conversation

@alexcrichton
Copy link
Copy Markdown
Member

This commit is a suite of fixes and support to fix a bug on the AArch64 backend, and others. Before this commit constants were unconditionally placed at the end of a function which means that if the function was too large then references from the beginning of the function may be too far away. The fix here is a bit subtle but is generally to emit all previously referenced constants when an island is emitted, resetting constants to get emitted on the next island. This means that constants may be emitted multiple times throughout function emission, which solves both a reference being too far in front of you and behind you.

I've done testing of this with a new Cranelift setting that automatically inserts padding between basic blocks. This setting is additionally hooked up to the fuzzer to help uncover issues like this on OSS-Fuzz.

Various relocations, jumps, and such require special handling in
`MachBuffer` with respect to islands to ensure that everything gets
emitted correctly. This commit adds a setting to synthetically insert
padding at the end of every basic block to help stress this logic with
more minimal test cases. The setting is disabled by default but is
something that we should be able to turn on during fuzzing, for example.
This commit fixes a bug in the AArch64 backend, and possibly others,
where constants were unconditionally forced to be at the end of the
function when they sometimes couldn't be. For example the `Ldr19`
relocation has a 512k range meaning that if an instruction near the
beginning of a function accesses a constant at the end of a function and
the function is >1M, then the relocation cannot be resolved. This is all
handled internally with `MachBuffer`'s handling of islands but the
problem with constants is that the labels (and the constant values)
weren't defined until the end of the function.

The first attempt at fixing this was to move the calls to
`defer_constant` to the beginning of emission. This would enable the
constants to get deferred as necessary. This was problematic, however,
because it only solved the forwards case (aka your constant was forced
to the end of the function which is too far away). The backwards case,
aka your constant is way too far behind you, was a new problem that
arose.

To fix all of these issues constants are now handled differently inside
of the `MachBuffer`. Previously constants were all pre-assigned a
label-per-constant and all references to the constant would use that
single label. Instead a new heuristic has been added where constants
record their size/alignment at the start of emission and labels are
lazily deferred. When a label for a constant is requested then a label
is lazily allocated or a previously-allocated label for this constant is
returned. When an island is emitted then all emitted constants get
their labels cleared. This intends to balance the previous functionality
of multiple uses of a constant only emit the constant once with fixing
this issue with simplicity as well. This means that constants may get
emitted multiple times, since each reference to a constant after an
island is generated will be guaranteed to generate a new label, even if
it's in-range to access. This can perhaps be fixed in the future with a
more clever API where the `LabelUse` is passed into the function which
converts a constant to a label, but that's left as a refactoring for a
future date.

This commit also moves an `alignment: u32` field into the
`MachBufferFinalized` itself since that's now a function of whatever
constants actually got emitted. Additionally note that constant
emission in the middle of a function doesn't actually emit anything,
instead recording markers of where constants need to go. Then when a
buffer is finalized the constants are passed in to get access to the
data which fills in everything as it's referenced.
This commit hooks up the previously-added setting to Cranelift to
Wasmtime's fuzzing infrastructure. This will automatically configure the
setting based on the fuzz input to add a bit of "chaos" to the emitted
code. This should hopefully help expose the issue fixed previously via
fuzzing which otherwise won't generate massive functions.
@alexcrichton alexcrichton requested review from a team as code owners May 16, 2023 01:29
@alexcrichton alexcrichton requested review from jameysharp and removed request for a team May 16, 2023 01:29
@cfallin cfallin self-assigned this May 16, 2023
@cfallin
Copy link
Copy Markdown
Member

cfallin commented May 16, 2023

(Assigning myself -- talked with Alex earlier today and happy to take this one)

Copy link
Copy Markdown
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks generally great -- thanks for tackling this! A few comments below mostly about cosmetics/docs but otherwise LGTM.

Comment thread cranelift/codegen/meta/src/shared/settings.rs
Comment thread cranelift/codegen/src/machinst/mod.rs Outdated
Comment thread cranelift/codegen/src/machinst/buffer.rs Outdated
Comment thread cranelift/codegen/src/machinst/buffer.rs Outdated
@github-actions github-actions Bot added cranelift Issues related to the Cranelift code generator cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift:module fuzzing Issues related to our fuzzing infrastructure wasmtime:api Related to the API of the `wasmtime` crate itself winch Winch issues or pull requests labels May 16, 2023
@github-actions
Copy link
Copy Markdown

Subscribe to Label Action

cc @fitzgen, @peterhuene, @saulecabrera

Details This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "cranelift:meta", "cranelift:module", "fuzzing", "wasmtime:api", "winch"

Thus the following users have been cc'd because of the following labels:

  • fitzgen: fuzzing
  • peterhuene: wasmtime:api
  • saulecabrera: winch

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

Otherwise misaligned instructions were getting emitted and tripping
various asserts.
@afonso360
Copy link
Copy Markdown
Contributor

Hey, I was playing around with adding this setting to fuzzgen and it found a panic on AArch64, looks like when setting bb_padding_log2 to 1, (I suspect) we get an unaligned basic block, which causes issues.

Testcase
test compile
set bb_padding_log2=1
target aarch64

function u1:0(i64) -> i64{
block0(v0: i64):
    jump block1

block1:
    return v0
}

Panic:

thread 'worker #0' panicked at 'assertion failed: pc_rel & 3 == 0', cranelift/codegen/src/isa/aarch64/inst/mod.rs:2847:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
 ERROR cranelift_filetests::concurrent > FAIL: panicked in worker #0: assertion failed: pc_rel & 3 == 0
FAIL ./lmao.clif: panicked in worker #0: assertion failed: pc_rel & 3 == 0
1 tests
Error: 1 failure

Comment thread cranelift/filetests/filetests/runtests/bb-padding.clif Outdated
@alexcrichton
Copy link
Copy Markdown
Member Author

we get an unaligned basic block, which causes issues.

Indeed! Fuzzing for me locally turned that up quite quickly and I forgot to push that up last night. I added a re-alignment now to get past that. Fuzzing found another panic last night so I'm going to debug that before merging this.

Allow for inserting one byte of padding.
Don't use `EmitInfo`, instead pass in to vcode emission
Fixes an off-by-just-a-few error if the two island checks are done
separately after a basic block.
@alexcrichton alexcrichton enabled auto-merge May 16, 2023 15:07
@alexcrichton alexcrichton added this pull request to the merge queue May 16, 2023
Merged via the queue into bytecodealliance:main with commit 49dd8fd May 16, 2023
@alexcrichton alexcrichton deleted the fix-ldr19-out-of-bounds branch May 16, 2023 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift:module cranelift Issues related to the Cranelift code generator fuzzing Issues related to our fuzzing infrastructure wasmtime:api Related to the API of the `wasmtime` crate itself winch Winch issues or pull requests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants