Change the default FMA codegen to be of 231 form instead of 213 form#25387
Change the default FMA codegen to be of 231 form instead of 213 form#25387hanblee wants to merge 1 commit intodotnet:masterfrom
Conversation
since it is likely to be most prevalent. E.g.,x = Fma.MultiplyAdd(a, b, x);
|
@tannergooding PTAL |
|
@hanblee, do you have any numbers/metrics showing that I had done some initial analysis of various C/C++ algorithms and raw assembly using I would much rather see this fixed, longer term, by having the register allocator support preferencing better and for it to also support indicating more than one operand can be reg optional. Also CC. @CarolEidt |
|
For this case in particular, if none of the operands can be contained, it would be ideal if we could choose the encoding based on whether the target register matches one of the input registers. |
|
Here is one view based on GitHub search results for the following four columns. 231 form is slightly (~10%) more common than 213 form.
|
| op1Reg = op1->gtRegNum; | ||
| op2Reg = op2->gtRegNum; | ||
|
|
||
| isCommutative = !copiesUpperBits; |
There was a problem hiding this comment.
I'm pretty sure we need to keep the isCommutative handling for the cases where the multiplicand and multiplier can be swapped.
| { | ||
| // 213 form: op1 = (op2 * op1) + op3 | ||
|
|
||
| if (copiesUpperBits) |
There was a problem hiding this comment.
Similar comment, pretty sure we need to preserve this logic since op1 impacts determinism.
There was a problem hiding this comment.
For clarity as well, I think it makes sense to continue to keep the reg-only case separate, even if it's using the same form as the case above.
| { | ||
| // 213 form: op1 = (op2 * op1) + op3 | ||
|
|
||
| if (copiesUpperBits) |
There was a problem hiding this comment.
For clarity as well, I think it makes sense to continue to keep the reg-only case separate, even if it's using the same form as the case above.
|
I'm not sure this is adequately motivated.
Supporting multiple reg-optional operands is a rather complex change (though it would have broad benefit), so it is not likely to happen soon. |
|
Thanks for your comments, and I agree on the need for a more comprehensive solutions. Closing this as a result. |
|
Thanks @hanblee for raising this issue. I've filed https://github.com/dotnet/coreclr/issues/25434 to make the improvements that have been discussed. |
as it is likely to be most prevalent. E.g.,
x = Fma.MultiplyAdd(a, b, x), and it removes the need for the redundant register to register move.Current codegen for the following code:
is
With this PR: