Enable simd_extmul_* for AArch64#3070
Conversation
00b264c to
ff2cd0f
Compare
|
@sparker-arm @akirilov-arm I'm currently out on vacation until Mon Jul 26; I can review this when I'm back if you prefer, or perhaps one of @abrown or @alexcrichton could review? |
|
@cfallin I pinged you because historically you had done most of the AArch64-related reviews (and Sam was actually unable to request a review from anybody), but another reviewer would be absolutely fine. |
alexcrichton
left a comment
There was a problem hiding this comment.
Looks pretty reasonable to me!
| Umull32, | ||
| /// Unsigned multiply add long | ||
| Umlal8, | ||
| Umlal16, |
There was a problem hiding this comment.
It seems like this and Umlal8 aren't actually generated in any lowerings (unless I'm missing something), but do you want to leave these in for completeness with possible future lowerings?
There was a problem hiding this comment.
Yes, exactly - thought it would be weird to just port the 32-bit version.
| ra: zero_reg(), | ||
| }); | ||
| } else if ty.is_vector() { | ||
| for ext_op in &[ |
There was a problem hiding this comment.
One possible way to structure this is to perhaps check for is_vector and i64x2 early on at the top of this block with early-returns if they match, and if that fails all the remaining cases share the logic of
let rn = put_input_in_reg(ctx, inputs[0], NarrowValueMode::None);
let rm = put_input_in_reg(ctx, inputs[1], NarrowValueMode::None);
let rd = get_output_reg(ctx, outputs[0]).only_reg().unwrap();as a prefix.
Just a thought though, happy to defer to you who work on this much more than I!
There was a problem hiding this comment.
So even though I don't like the duplication, or how much conditional stuff is going on here, we would only be able to factor out one collection of these calls as the I128 case uses, ever so slightly, different calls to handle multiple registers. So I think I prefer the clarity.
cfallin
left a comment
There was a problem hiding this comment.
LGTM, thanks! Just one request for a doc-comment below.
| None | ||
| } | ||
|
|
||
| pub(crate) fn match_vec_long_mul<C: LowerCtx<I = Inst>>( |
There was a problem hiding this comment.
Can we add a doc-comment here describing the return tuple? E.g. it's not clear to me reading at this point what the bool denotes, until I notice the low/high pattern below in the implementation.
Lower simd_extmul_[low/high][signed/unsigned] to [s|u]widen inputs to an imul node. Copyright (c) 2021, Arm Limited.
Copyright (c) 2021, Arm Limited.
And removed an accidental code move. Copyright (c) 2021, Arm Limited.
ff2cd0f to
5eb2dca
Compare
Lower simd_extmul_[low/high][signed/unsigned] to [s|u]widen inputs to
an imul node.
This also reorganises the aarch64 'long mul' operations and their assembly.
Copyright (c) 2021, Arm Limited.