Name and Version
.
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
No response
Command line
Problem description & steps to reproduce
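Token generation (tg128) throughput with `--device none` regressed at commit a5251ca (#17996): from 6.65 t/s on the previous build to 5.02 t/s. Prompt processing (pp512) is roughly unchanged.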
before:
./build/bin/llama-bench --threads 12 --device none --model qwen3_next_80b_a3b_instruct-iq4_nl.gguf
| model | size | params | backend | ngl | dev | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------ | --------------: | -------------------: |
| qwen3next 80B.A3B IQ4_NL - 4.5 bpw | 41.98 GiB | 79.67 B | CUDA | 99 | none | pp512 | 64.40 ± 3.75 |
| qwen3next 80B.A3B IQ4_NL - 4.5 bpw | 41.98 GiB | 79.67 B | CUDA | 99 | none | tg128 | 6.65 ± 0.01 |
build: fb644247d (7431)
after:
./build/bin/llama-bench --threads 12 --device none --model qwen3_next_80b_a3b_instruct-iq4_nl.gguf
| model | size | params | backend | ngl | dev | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | ------------ | --------------: | -------------------: |
| qwen3next 80B.A3B IQ4_NL - 4.5 bpw | 41.98 GiB | 79.67 B | CUDA | 99 | none | pp512 | 67.14 ± 0.92 |
| qwen3next 80B.A3B IQ4_NL - 4.5 bpw | 41.98 GiB | 79.67 B | CUDA | 99 | none | tg128 | 5.02 ± 0.00 |
build: a5251ca11 (7432)
Hardware: AMD Ryzen 9 5900X, RAM clocked at 2133 MHz
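
To put a number on the regression, here is a minimal sketch that computes the relative change between the two runs. The t/s means are copied from the tables above; the helper name `relative_change` is only illustrative and is not part of llama.cpp or llama-bench.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# Mean t/s copied from the llama-bench tables above:
# (build fb644247d, build a5251ca11)
results = {
    "pp512": (64.40, 67.14),
    "tg128": (6.65, 5.02),
}

for test, (before, after) in results.items():
    change = relative_change(before, after)
    print(f"{test}: {before:.2f} -> {after:.2f} t/s ({change:+.1f}%)")
```

By this calculation tg128 drops by about 24.5%, while the pp512 difference of about +4.3% is smaller than the ± 3.75 t/s spread reported for the earlier run.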
First Bad Commit
No response
Relevant log output