| Qwen 3.6 27b vs Codex GPT 5.5 / Claude Opus 4.7 My local llm discovered a bug that they both missed And it turns out it's critical GPT 5.5 and Claude both stood their ground and didn't give up until the end - they claimed to be right all along. I told my Qwen to provide detailed proof of his arguments, brought the evidance to both of them, and only then came their admission. Qwen 3.6 27b thinks a lot. That can be both a good and a bad thing. In this case, the long thinking actually discovered a bug neither of the frontier models couldn't find. GPT 5.5 is FAST. Really fast. But in reality as I found out, it comes with a big tradeoff. [link] [comments] |