Skip to content

Fix AArch64 ABI to respect half-caller-save, half-callee-save vec regs.#2267

Merged
cfallin merged 1 commit into
bytecodealliance:mainfrom
cfallin:fix-aarch64-abi
Oct 6, 2020
Merged

Fix AArch64 ABI to respect half-caller-save, half-callee-save vec regs.#2267
cfallin merged 1 commit into
bytecodealliance:mainfrom
cfallin:fix-aarch64-abi

Conversation

@cfallin
Copy link
Copy Markdown
Member

@cfallin cfallin commented Oct 6, 2020

This PR updates the AArch64 ABI implementation so that it (i) properly
respects that v8-v15 inclusive have callee-save lower halves, and
caller-save upper halves, by conservatively approximating (to full
registers) in the appropriate directions when generating prologue
caller-saves and when informing the regalloc of clobbered regs across
callsites.

In order to prevent saving all of these vector registers in the prologue
of every non-leaf function due to the above approximation, this also
makes use of a new regalloc.rs feature to exclude call instructions'
writes from the clobber set returned by register allocation. This is
safe whenever the caller and callee have the same ABI (because anything
the callee could clobber, the caller is allowed to clobber as well
without saving it in the prologue).

Fixes #2254.

@cfallin cfallin requested a review from julian-seward1 October 6, 2020 01:47
@github-actions github-actions Bot added cranelift Issues related to the Cranelift code generator cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen labels Oct 6, 2020
Comment thread cranelift/codegen/src/isa/aarch64/inst/mod.rs
Comment thread cranelift/codegen/src/isa/aarch64/inst/mod.rs
Comment thread cranelift/codegen/src/isa/aarch64/inst/mod.rs
Copy link
Copy Markdown
Contributor

@julian-seward1 julian-seward1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionally fine; but some concern about creeping ambiguity in naming. Can we discuss revised names?

Comment thread cranelift/codegen/src/isa/aarch64/abi.rs
Comment thread cranelift/codegen/src/isa/aarch64/inst/mod.rs
@cfallin
Copy link
Copy Markdown
Member Author

cfallin commented Oct 6, 2020

Thanks! Updated based on comments. I definitely like the renaming in terms of explicit actions ("clobbered by call", "saved in prologue") in the ABI impls. The only open question is the naming of is_included_in_clobbers() in the MachInst interface; I'm curious if you have any other good names in mind. Open to whatever makes sense!

@akirilov-arm
Copy link
Copy Markdown
Contributor

I also had a quick look at the changes and they are good, but now I have realized that there is a gap in the tests - there is nothing that covers the callee side. However, that is definitely a job for another PR, in particular one that is going to fix function prologues and epilogues, so that they deal only with the lower 64 bits of the SIMD & FP registers. At the very least we should have a function with a chain of operations such as v1 = v0 + v0, v2 = v1 + v1, ..., f(v0, v1, v2, ...) to force the compiler to use the callee-saved registers, possibly mixing data types (f32, f64 and i8x16).

This PR updates the AArch64 ABI implementation so that it (i) properly
respects that v8-v15 inclusive have callee-save lower halves, and
caller-save upper halves, by conservatively approximating (to full
registers) in the appropriate directions when generating prologue
caller-saves and when informing the regalloc of clobbered regs across
callsites.

In order to prevent saving all of these vector registers in the prologue
of every non-leaf function due to the above approximation, this also
makes use of a new regalloc.rs feature to exclude call instructions'
writes from the clobber set returned by register allocation. This is
safe whenever the caller and callee have the same ABI (because anything
the callee could clobber, the caller is allowed to clobber as well
without saving it in the prologue).

Fixes bytecodealliance#2254.
@cfallin
Copy link
Copy Markdown
Member Author

cfallin commented Oct 6, 2020

@akirilov-arm good point; thanks! I just updated the PR to include a test (prologue.clif) that does exactly that, and checks that all of the callee-saves are saved (lower halves of v8-v15, covered by saving all of v8-v15 currently). Let me know if this covers what you had in mind.

@akirilov-arm
Copy link
Copy Markdown
Contributor

@cfallin Yes, and it is simpler than what I had in mind, which is even better. I think that together with the other tests it will be a good exercise for an optimal with respect to the AAPCS64 implementation (trying to be a little bit forward-thinking here), and yet it demonstrates the current issues that are simpler, namely handling the full registers and the lack of paired loads and stores.

@cfallin cfallin merged commit fc430ee into bytecodealliance:main Oct 6, 2020
@cfallin cfallin deleted the fix-aarch64-abi branch October 6, 2020 22:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. cranelift:area:x64 Issues related to x64 codegen cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AArch64: conform to AArch64 calling convention's half-caller/half-callee-save vector register conventions.

4 participants