LocalLLaMA

Power-limit vs TG/s for 2×3090

I'm trying to find the sweet spot in the tradeoff between GPU power limit and token generation speed (TG/s). 250W seems to be the sweet spot for Qwen3.6-27B. Interestingly, I got higher TG/s at 275W with 1 concurrent request.

vLLM server config (from tedivm):

vllm serve /models/Qwen3…
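For anyone who wants to repeat this kind of sweep, here is a rough Python sketch of the bookkeeping. It is not what was used for the numbers above. The helper names (`power_limit_command`, `tokens_per_joule`, `pick_sweet_spot`) and the "keep at least 95% of peak TG/s" rule are my own assumptions for illustration. The only real CLI piece is `nvidia-smi -i <index> -pl <watts>`, which sets a per-GPU power limit and normally needs root.

```python
from typing import Dict, List


def power_limit_command(gpu_index: int, watts: int) -> List[str]:
    """Build (but do not run) the nvidia-smi command that caps one GPU's power.

    Pass the result to subprocess.run(...) yourself if you want to apply it.
    """
    return ["sudo", "nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


def tokens_per_joule(tg_per_s: float, total_watts: float) -> float:
    """Efficiency metric: generated tokens per joule (tokens/s divided by watts).

    total_watts is the combined limit across all cards, e.g. 2 * 250 for 2x3090.
    """
    if total_watts <= 0:
        raise ValueError("total_watts must be positive")
    return tg_per_s / total_watts


def pick_sweet_spot(results: Dict[int, float], keep_fraction: float = 0.95) -> int:
    """Return the lowest power limit that keeps >= keep_fraction of peak TG/s.

    results maps per-GPU power limit (W) -> measured TG/s at that limit.
    The 95% default is an arbitrary assumption, not a measured threshold.
    """
    if not results:
        raise ValueError("results must not be empty")
    peak = max(results.values())
    eligible = [watts for watts, tg in results.items() if tg >= keep_fraction * peak]
    return min(eligible)
```

Fill `results` with your own measurements per power limit and per concurrency level. The sweet spot can differ between 1 concurrent request and several, so it is probably worth running `pick_sweet_spot` separately for each.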