My server running llama-server and Qwen3.5 (mostly for coding agents) has been rebooting several times a day since I switched from Qwen3.5-35B-A3B to Qwen3.5-27B. Any suggestions?
I tried stopping services like Ollama and lowering the GPU power limit with `nvidia-smi -pl 300` (the default is 350 W). But as long as the 27B model was run with CUDA, it would crash and the machine would reboot itself.
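For reference, here is a sketch of the mitigation described above, plus clock locking, which often helps more than a power cap alone because sudden reboots under GPU load are frequently millisecond-scale current spikes tripping the PSU. The 300 W value is from the post; the `-lgc` clock range is an assumption to tune for your cards:

```shell
# Keep settings applied until the next reboot.
sudo nvidia-smi -pm 1

# Cap sustained board power (RTX 3090 default is ~350 W).
sudo nvidia-smi -pl 300

# Lock the graphics clock range (min,max in MHz) to blunt transient
# current spikes that a power limit alone does not catch.
# These values are an example, not a recommendation.
sudo nvidia-smi -lgc 210,1400

# Watch live draw per GPU while the model loads and runs inference.
nvidia-smi --query-gpu=index,power.draw,power.limit --format=csv -l 1
```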
My server config:
CPU: dual AMD EPYC 9255 (24 cores each)
GPU: dual RTX 3090 with NVLink
OS: Ubuntu 22.04.1 with kernel 6.8.0-52-generic
driver version: 525.183.01
llama-server: b8508, b8580 built with CUDA 12.4.0_550.54.14
arguments: -m Qwen3.5-27B-UD-IQ3_XXS.gguf --mmproj mmproj-F16.gguf --host 0.0.0.0 --fit on
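One way to narrow down whether this setup is hitting a driver fault or a hard power cut: after a reboot, check the previous boot's logs (this assumes a persistent systemd journal, which Ubuntu 22.04 normally has). A kernel oops or NVIDIA Xid error points at software/driver; a log that simply stops mid-line usually means the PSU tripped:

```shell
# Last kernel messages from the boot that crashed.
journalctl -k -b -1 | tail -n 50

# NVIDIA driver errors (NVRM / Xid) from that boot, if any were logged.
journalctl -b -1 | grep -iE 'nvrm|xid' | tail -n 20
```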