Specs: Core i5 14400F, 32 GB DDR4 RAM at 3200 MHz, RTX 4060

Current speeds: 30 tps output, 500 tps prefill

Command I currently use:

    .\llama-server.exe `
      -m "H:\model\unsloth\Qwen3.6-35B-A3B-GGUF\Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf" `
      --host 0.0.0.0 --port 8080 `
      --alias "claude-sonnet-4-5" `
      -ngl 999 `
      --n-cpu-moe 36 `
      -c 65535 `
      -b 4096 `
      -ub 2048 `
      -t 6 `
      -tb 10 `
      --cont-batching `
      --mlock `
      -ctk turbo4 -ctv turbo3 `
      -fa on `
      --jinja `
      --warmup `
      --perf
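For a sense of what those two numbers mean in practice, here is a rough sketch that turns them into wall-clock time per request. The function name `estimate_seconds` and the example request sizes are made up for illustration, and it assumes both rates stay constant at every context length, which is optimistic since generation usually slows down as the context fills.

```python
def estimate_seconds(prompt_tokens, output_tokens, prefill_tps=500.0, decode_tps=30.0):
    """Rough request time: prompt processing time plus generation time.

    Defaults are the speeds quoted in the post (500 tps prefill, 30 tps output).
    Assumption: both rates are constant regardless of context length.
    """
    if prefill_tps <= 0 or decode_tps <= 0:
        raise ValueError("throughput must be positive")
    return prompt_tokens / prefill_tps + output_tokens / decode_tps


if __name__ == "__main__":
    # Hypothetical request sizes; the last one fills the whole -c 65535 context.
    for prompt, output in [(2_000, 500), (16_000, 1_000), (65_535, 1_000)]:
        seconds = estimate_seconds(prompt, output)
        print(f"{prompt:>6} prompt + {output:>5} output tokens: ~{seconds:.0f} s")
```

Under these assumptions, prefill takes up most of the time on long prompts, so with a large context the 500 tps figure matters at least as much as the 30 tps one.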