Fix AArch64 ABI to respect half-caller-save, half-callee-save vec regs. #2267
julian-seward1
left a comment
Functionally fine, but I have some concern about creeping ambiguity in naming. Can we discuss revised names?
|
Thanks! Updated based on comments. I definitely like the renaming in terms of explicit actions ("clobbered by call", "saved in prologue") in the ABI impls. The only open question is the naming of |
|
I also had a quick look at the changes and they are good, but now I have realized that there is a gap in the tests - there is nothing that covers the callee side. However, that is definitely a job for another PR, in particular one that is going to fix function prologues and epilogues, so that they deal only with the lower 64 bits of the SIMD & FP registers. At the very least we should have a function with a chain of operations such as |
This PR updates the AArch64 ABI implementation so that it properly respects that v8-v15 inclusive have callee-save lower halves and caller-save upper halves. It does so by conservatively approximating (to full registers) in the appropriate direction in two places: when generating the prologue's callee-saves, and when informing the regalloc of the registers clobbered across callsites.

To avoid saving all of these vector registers in the prologue of every non-leaf function because of the above approximation, this also makes use of a new regalloc.rs feature that excludes call instructions' writes from the clobber set returned by register allocation. This is safe whenever the caller and callee have the same ABI, because anything the callee could clobber, the caller is allowed to clobber as well without saving it in the prologue.

Fixes bytecodealliance#2254.
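As a rough illustration of the reasoning in the description above, here is a minimal Rust sketch of the two approximations and of why call writes can be dropped from the prologue's save set. It is not the code in this PR and not the regalloc.rs API: `VRegClass`, `Write`, `clobbered_by_call`, `saved_in_prologue`, and `prologue_saves` are hypothetical names, and only vector registers are modeled.

```rust
use std::collections::BTreeSet;

/// How AAPCS64 treats a vector register v0-v31 across a call.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum VRegClass {
    /// v0-v7 and v16-v31: the callee may overwrite all 128 bits.
    CallerSaved,
    /// v8-v15: the callee preserves only the lower 64 bits.
    LowerHalfCalleeSaved,
}

fn classify(vreg: u8) -> VRegClass {
    assert!(vreg < 32, "AArch64 has vector registers v0-v31");
    if (8..=15).contains(&vreg) {
        VRegClass::LowerHalfCalleeSaved
    } else {
        VRegClass::CallerSaved
    }
}

/// Caller side: the upper halves of v8-v15 may be lost across a call, so the
/// whole register is conservatively reported as clobbered.
fn clobbered_by_call(vreg: u8) -> bool {
    match classify(vreg) {
        VRegClass::CallerSaved => true,
        VRegClass::LowerHalfCalleeSaved => true, // approximation to the full register
    }
}

/// Callee side: if the function body writes v8-v15, the prologue
/// conservatively saves the full register.
fn saved_in_prologue(vreg: u8) -> bool {
    classify(vreg) == VRegClass::LowerHalfCalleeSaved
}

/// One register write observed after register allocation (hypothetical shape).
struct Write {
    vreg: u8,
    /// True when the write is a call instruction's clobber, not a real def.
    from_call: bool,
}

/// Registers the prologue must save. Writes that come from call instructions
/// are excluded: with matching ABIs, the callee preserves whatever the caller
/// would have had to preserve.
fn prologue_saves(writes: &[Write]) -> BTreeSet<u8> {
    writes
        .iter()
        .filter(|w| !w.from_call)
        .map(|w| w.vreg)
        .filter(|&v| saved_in_prologue(v))
        .collect()
}
```

In this model, a function whose only vector writes come from calls saves nothing in its prologue, while a direct write to, say, v9 still forces a save of v9.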
|
@akirilov-arm good point; thanks! I just updated the PR to include a test ( |
|
@cfallin Yes, and it is simpler than what I had in mind, which is even better. I think that, together with the other tests, it will be a good exercise for an implementation that is optimal with respect to AAPCS64 (trying to be a little bit forward-thinking here), and yet it also demonstrates the simpler current issues, namely saving the full registers and the lack of paired loads and stores.
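To put rough numbers on the two issues mentioned above (full-register saves and no paired stores), here is a small Rust sketch. The helper names `save_area_bytes` and `store_count` are hypothetical and do not come from Cranelift. It assumes only that AArch64 keeps the stack pointer 16-byte aligned and that a paired store handles two registers per instruction.

```rust
/// Stack bytes needed to save `n` of the registers v8-v15, rounded up to the
/// 16-byte stack alignment. `bytes_per_reg` is 16 for a full 128-bit save and
/// 8 when only the callee-saved lower 64 bits are stored.
fn save_area_bytes(n: u32, bytes_per_reg: u32) -> u32 {
    let raw = n * bytes_per_reg;
    (raw + 15) & !15
}

/// Number of store instructions: a paired store covers two registers, so an
/// odd count needs one extra single store.
fn store_count(n: u32, paired: bool) -> u32 {
    if paired {
        (n + 1) / 2
    } else {
        n
    }
}
```

For all eight registers, this gives 128 bytes and 8 stores with full-register single stores, against 64 bytes and 4 stores with paired stores of the lower halves.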