Misc. bug: Qwen3-Next token generation performance regression (CPU-only)

### Name and Version

.

### Operating systems

Linux

### Which llama.cpp modules do you know to be affected?

_No response_

### Command line

```shell

```

### Problem description & steps to reproduce

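The regression was introduced by commit a5251ca11d2317d93a7b6da4217483f4e83beb3d (#17996). Token generation (tg128) drops from 6.65 t/s to 5.02 t/s, while prompt processing (pp512) stays within run-to-run noise.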

Before (build fb644247d):
```
./build/bin/llama-bench --threads 12 --device none --model qwen3_next_80b_a3b_instruct-iq4_nl.gguf
| model                          |       size |     params | backend    | ngl | dev          |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------ | --------------: | -------------------: |
| qwen3next 80B.A3B IQ4_NL - 4.5 bpw |  41.98 GiB |    79.67 B | CUDA       |  99 | none         |           pp512 |         64.40 ± 3.75 |
| qwen3next 80B.A3B IQ4_NL - 4.5 bpw |  41.98 GiB |    79.67 B | CUDA       |  99 | none         |           tg128 |          6.65 ± 0.01 |

build: fb644247d (7431)
```

After (build a5251ca11):
```
./build/bin/llama-bench --threads 12 --device none --model qwen3_next_80b_a3b_instruct-iq4_nl.gguf
| model                          |       size |     params | backend    | ngl | dev          |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------ | --------------: | -------------------: |
| qwen3next 80B.A3B IQ4_NL - 4.5 bpw |  41.98 GiB |    79.67 B | CUDA       |  99 | none         |           pp512 |         67.14 ± 0.92 |
| qwen3next 80B.A3B IQ4_NL - 4.5 bpw |  41.98 GiB |    79.67 B | CUDA       |  99 | none         |           tg128 |          5.02 ± 0.00 |

build: a5251ca11 (7432)
```

Hardware: AMD Ryzen 9 5900X, RAM clocked at 2133 MHz.
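
For reference, the relative change between the two runs can be worked out from the t/s columns above. This is a minimal sketch, assuming a POSIX shell with `awk` available; `pct_change` is a hypothetical helper written for this report, not part of llama.cpp:

```shell
# Hypothetical helper: percentage change from a "before" t/s value to an "after" t/s value.
pct_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%+.1f%%\n", (after - before) / before * 100 }'
}

pct_change 6.65 5.02    # tg128: prints -24.5%
pct_change 64.40 67.14  # pp512: prints +4.3%
```

By that measure tg128 is roughly a quarter slower after the commit, while the pp512 difference falls inside the ± 3.75 spread reported for the earlier build.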

### First Bad Commit

_No response_

### Relevant log output

```shell

```

Misc. bug: Qwen3-Next token generation performance regression (CPU-only) #18112
