LocalLLaMA

Qwen3.5-27B on RTX 5090 served via vLLM @ 77 tps

After maxing out my $20 Cursor sub and my $10 Z.ai sub for this month, I resorted to a local LLM setup. It turned out well: Qwen3.5-27B on an RTX 5090 runs at about 77 tps, with the context window at 218k tokens. It can even run 2 concurrent sessions with this…
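
For anyone wanting to try something similar, here is a rough sketch of how a launch command with those two settings (context length and concurrent sessions) could be put together. It is not the exact config from this post. The model placeholder, the reading of 218k as 218,000 tokens, and the 0.90 GPU memory fraction are all assumptions; `--max-model-len`, `--max-num-seqs`, and `--gpu-memory-utilization` are standard `vllm serve` options.

```python
import shlex


def build_vllm_serve_command(model, max_model_len, max_num_seqs,
                             gpu_memory_utilization=0.90):
    """Build an argument list for `vllm serve` (illustrative helper, not part of vLLM)."""
    return [
        "vllm", "serve", model,
        # Longest sequence (prompt + output) the server will accept.
        "--max-model-len", str(max_model_len),
        # Upper bound on requests batched together, i.e. concurrent sessions.
        "--max-num-seqs", str(max_num_seqs),
        # Fraction of GPU memory vLLM may claim for weights and KV cache.
        "--gpu-memory-utilization", str(gpu_memory_utilization),
    ]


# Placeholder model name: substitute the actual model ID or local path.
cmd = build_vllm_serve_command("<model-id-or-local-path>", 218_000, 2)
print(shlex.join(cmd))
```

Whether a 218k window and 2 sessions actually fit in memory depends on the quantization and KV cache settings, which this sketch leaves out.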