How can I improve inference speed?
Specs: Core i5-14400F, 32 GB DDR4-3200 RAM, RTX 4060.

Current speeds: 30 tokens/s output, 500 tokens/s prefill.

Command I currently use:

.\llama-server.exe `
-m "H:\model\unsloth\Qwen3.6-35B-A3B-GGUF\Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf" `
…