Cranelift: ISLE: aarch64: `imm12_from_negated_value` rules don't apply for i32, i16

Our prototype verifier seems to have found a completeness/performance (not correctness) bug in the handling of some narrow immediate values in `aarch64`. 

`aarch64` has the following rules that check if the _negated_ value of a constant fits in an immediate (`Imm12`), doing a corresponding switch between add and subtract:

```
;; Same as the previous special case, except we can switch the subtraction to an
;; addition if the negated immediate fits in 12 bits.
(rule 2 (lower (has_type (fits_in_64 ty) (isub x (imm12_from_negated_value y))))
      (add_imm ty x y)) 
      
;; Same as the previous special cases, except we can switch the addition to a
;; subtraction if the negated immediate fits in 12 bits.
(rule 2 (lower (has_type (fits_in_64 ty) (iadd x (imm12_from_negated_value y))))
      (sub_imm ty x y))

(rule 3 (lower (has_type (fits_in_64 ty) (iadd (imm12_from_negated_value x) y)))
      (sub_imm ty y x))
```

These rules should work for all `fits_in_64` types (i8, i16, i32, i64), but they seem to only work as intended for i64 (they aren't stricly needed for i8, since the versions of these rules without negation trigger there since i8 values always fit in `Imm12`).

In particular, our prototype verifier found that for i16 and i32, these rules _only_ apply if the operand extracted with `imm12_from_negated_value` is exactly 0, rather than values in the range Cranelift lowers to `Imm12`: constants with negated values `val`:  `(val < 0xfff) || (val < 0xfff_000 && (val & 0xfff == 0))`. 

For example, adding a constant -1 to a value should always be able to be lowered to subtracting an immediate 1 from the value, but as shown in the test cases below, clif only uses `sub` for i8 and i64 in this case and otherwise keeps a `mov` and `add`. 

### What I think is the bug

As discussed in https://github.com/bytecodealliance/wasmtime/issues/5700, Clif semantics are that narrow (i8, i64, i32) `iconst` are stored in an underlying `u64` that is zero-extended, not sign-extended, from the narrow bits. 

`imm12_from_negated_value` negates this zero-extended value as an i64 then checks whether that fits into an immediate:
```rust
fn imm12_from_negated_u64(&mut self, n: u64) -> Option<Imm12> {
    Imm12::maybe_from_u64((n as i64).wrapping_neg() as u64)
}
```

Any narrow negative constant, after being zero extended, has left-filled zeros, so negating the value no longer produces a constant that is unsigned `< 0xfff` or `< 0xfff_000`. 

When handling narrow iconst values, `imm12_from_negated_value` should negate the narrow value then zero extend in this case. In general, ISLE terms operating on the `u64` may need to see the `u32`, `u16`, etc, instead.  I'll work on a PR for at least fixing `imm12_from_negated_value`.

### `.clif` Test Case

`imm12-64.clif`
```
test interpret
test run
target aarch64

function %a(i64) -> i64 {
block0(v0: i64):
    v1 = iconst.i64 -1
    v3 = iadd v0, v1
    return v3
}
; run: %a(0) == -1
```

Because the `.clif` parser currently sign-extends instead of zero-extends values, for narrow constants I use the decimal -1 value as an unsigned u64 to construct the iconst:

`imm12-32.clif`
```
test interpret
test run
target aarch64

function %a(i32) -> i32 {
block0(v0: i32):
    v1 = iconst.i32 4294967295
    v3 = iadd v0, v1
    return v3
}
; run: %a(0) == -1
```

### Steps to Reproduce

```
cargo run -- compile --target aarch64 -D imm12-64.clif
cargo run -- compile --target aarch64 -D imm12-32.clif
cargo run -- compile --target aarch64 -D imm12-16.clif
cargo run -- compile --target aarch64 -D imm12-8.clif
```

### Expected Results

This function gets rewritten to:
```
sub     x0, x0, #1
ret
```
in all cases.

### Actual Results

```
cargo run -- compile --target aarch64 -D imm12-64.clif
.byte 0, 4, 0, 209, 192, 3, 95, 214

Disassembly of 8 bytes:
   0:   00 04 00 d1             sub     x0, x0, #1         <-- good use of immediate
   4:   c0 03 5f d6             ret
   
cargo run -- compile --target aarch64 -D imm12-32.clif
.byte 2, 0, 128, 18, 0, 0, 2, 11, 192, 3, 95, 214

Disassembly of 12 bytes:
   0:   02 00 80 12             mov     w2, #-1
   4:   00 00 02 0b             add     w0, w0, w2         <-- no immediate
   8:   c0 03 5f d6             ret

cargo run -- compile --target aarch64 -D imm12-16.clif
.byte 226, 255, 159, 82, 0, 0, 2, 11, 192, 3, 95, 214

Disassembly of 12 bytes:
   0:   e2 ff 9f 52             mov     w2, #0xffff
   4:   00 00 02 0b             add     w0, w0, w2         <-- no immediate
   8:   c0 03 5f d6             ret
cargo run -- compile --target aarch64 -D imm12-8.clif
.byte 0, 252, 3, 17, 192, 3, 95, 214

Disassembly of 8 bytes:
   0:   00 fc 03 11             add     w0, w0, #0xff      <-- good use of immediate because #0xff is small
   4:   c0 03 5f d6             ret
```

### Versions and Environment

Cranelift version or commit: c4a2c1e81893a41b363ba135b9fd79fb77eb0531

Operating system: macOS

Architecture: aarch64

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cranelift: ISLE: aarch64: `imm12_from_negated_value` rules don't apply for i32, i16 #5903

What I think is the bug

`.clif` Test Case

Steps to Reproduce

Expected Results

Actual Results

Versions and Environment

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Cranelift: ISLE: aarch64: imm12_from_negated_value rules don't apply for i32, i16 #5903

Description

What I think is the bug

.clif Test Case

Steps to Reproduce

Expected Results

Actual Results

Versions and Environment

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Cranelift: ISLE: aarch64: `imm12_from_negated_value` rules don't apply for i32, i16 #5903

`.clif` Test Case