aarch64: Fix Ldr19 relocations being unresolvable #6384
Conversation
Various relocations, jumps, and such require special handling in `MachBuffer` with respect to islands to ensure that everything gets emitted correctly. This commit adds a setting to synthetically insert padding at the end of every basic block to help stress this logic with more minimal test cases. The setting is disabled by default but is something that we should be able to turn on during fuzzing, for example.
This commit fixes a bug in the AArch64 backend, and possibly others, where constants were unconditionally forced to be at the end of the function when they sometimes couldn't be. For example the `Ldr19` relocation has a 512k range meaning that if an instruction near the beginning of a function accesses a constant at the end of a function and the function is >1M, then the relocation cannot be resolved. This is all handled internally with `MachBuffer`'s handling of islands but the problem with constants is that the labels (and the constant values) weren't defined until the end of the function. The first attempt at fixing this was to move the calls to `defer_constant` to the beginning of emission. This would enable the constants to get deferred as necessary. This was problematic, however, because it only solved the forwards case (aka your constant was forced to the end of the function which is too far away). The backwards case, aka your constant is way too far behind you, was a new problem that arose. To fix all of these issues constants are now handled differently inside of the `MachBuffer`. Previously constants were all pre-assigned a label-per-constant and all references to the constant would use that single label. Instead a new heuristic has been added where constants record their size/alignment at the start of emission and labels are lazily deferred. When a label for a constant is requested then a label is lazily allocated or a previously-allocated label for this constant is returned. When an island is emitted then all emitted constants get their labels cleared. This intends to balance the previous functionality of multiple uses of a constant only emit the constant once with fixing this issue with simplicity as well. This means that constants may get emitted multiple times, since each reference to a constant after an island is generated will be guaranteed to generate a new label, even if it's in-range to access. This can perhaps be fixed in the future with a more clever API where the `LabelUse` is passed into the function which converts a constant to a label, but that's left as a refactoring for a future date. This commit also moves an `alignment: u32` field into the `MachBufferFinalized` itself since that's now a function of whatever constants actually got emitted. Additionally note that constant emission in the middle of a function doesn't actually emit anything, instead recording markers of where constants need to go. Then when a buffer is finalized the constants are passed in to get access to the data which fills in everything as it's referenced.
This commit hooks up the previously-added setting to Cranelift to Wasmtime's fuzzing infrastructure. This will automatically configure the setting based on the fuzz input to add a bit of "chaos" to the emitted code. This should hopefully help expose the issue fixed previously via fuzzing which otherwise won't generate massive functions.
|
(Assigning myself -- talked with Alex earlier today and happy to take this one) |
cfallin
left a comment
There was a problem hiding this comment.
This looks generally great -- thanks for tackling this! A few comments below mostly about cosmetics/docs but otherwise LGTM.
Subscribe to Label Actioncc @fitzgen, @peterhuene, @saulecabrera DetailsThis issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:machinst", "cranelift:area:x64", "cranelift:meta", "cranelift:module", "fuzzing", "wasmtime:api", "winch"Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
Otherwise misaligned instructions were getting emitted and tripping various asserts.
|
Hey, I was playing around with adding this setting to fuzzgen and it found a panic on AArch64, looks like when setting TestcasePanic: |
Indeed! Fuzzing for me locally turned that up quite quickly and I forgot to push that up last night. I added a re-alignment now to get past that. Fuzzing found another panic last night so I'm going to debug that before merging this. |
Allow for inserting one byte of padding.
Don't use `EmitInfo`, instead pass in to vcode emission
Fixes an off-by-just-a-few error if the two island checks are done separately after a basic block.
This commit is a suite of fixes and support to fix a bug on the AArch64 backend, and others. Before this commit constants were unconditionally placed at the end of a function which means that if the function was too large then references from the beginning of the function may be too far away. The fix here is a bit subtle but is generally to emit all previously referenced constants when an island is emitted, resetting constants to get emitted on the next island. This means that constants may be emitted multiple times throughout function emission, which solves both a reference being too far in front of you and behind you.
I've done testing of this with a new Cranelift setting that automatically inserts padding between basic blocks. This setting is additionally hooked up to the fuzzer to help uncover issues like this on OSS-Fuzz.