| I know benchmarks are questionable, imprecise for individual use cases, and that LLMs are often trained to excel at them. But we're not talking about numbers here. We're talking about a trend. When I was using GPT 4o or Sonnet 3.7, if you'd told me I could do all those things locally within such a short time, I wouldn't have believed it. Now it's happening, and not just for those with 400GB of VRAM. It's also happening on more affordable hardware. If Qwen 3.6 27b actually comes out soon, I think it will be truly incredible. True, we're seeing licenses change, and open source developers increasingly need to monetize. But it's a really great time. Yesterday, using the odd Qwen 27b + Minimax 2.7 Q4 combo, I completed tasks that I normally couldn't finish without Claude. For those who want GLM 5 Air: rediscover 4.7, which is still very good and smaller. This is a chart that answers many of the questions I read here daily. |