The Qwen 3.6 35B A3B hype is real!!!

My personal test for small local LLM intelligence is to check whether a model can understand the code I write for my own academic research. My research is on some pretty niche topics, and I doubt anything like it is substantively present in LLM training sets. A few months ago, small local models' ability to understand my code was nominal at best, with Devstral Small 2 being the top performer. However, several small open-weight models now have mechanisms for accommodating fairly long contexts (gated delta net, hybrid Mamba2, sliding-window attention), which makes them dramatically more capable. I can now feed a model an entire academic paper along with the accompanying code and ask it to use the paper to work out what the code is doing.
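For anyone unfamiliar with why these mechanisms help: sliding-window attention caps how far back each token can look, so cost grows linearly with context length instead of quadratically. Here's a minimal illustrative sketch of a causal sliding-window mask (not any particular model's actual implementation):

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Causal sliding-window attention mask.

    Each query position i may attend only to itself and the previous
    (window - 1) key positions, so the number of attended keys per token
    is bounded by `window` regardless of sequence length.
    """
    i = np.arange(seq_len)[:, None]  # query positions
    j = np.arange(seq_len)[None, :]  # key positions
    return (j <= i) & (j > i - window)

mask = sliding_window_mask(6, 3)
# Token 5 attends only to positions 3, 4, 5.
```

With full causal attention the mask row for token 5 would cover positions 0 through 5; the window version keeps it fixed-size, which is what lets these models stretch to paper-plus-codebase contexts.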

I just spent a couple days experimenting with:

  • Qwen 3.6 35B A3B
  • Qwen 3.6 27B
  • Gemma 4 26B A4B
  • Nemotron 3 Nano

All of them comprehended my code significantly better than any small local model could a few months ago. I did try Devstral Small 2, since I recently went from a single 16GB graphics card to two; however, I simply couldn't fit the long context in 32GB of VRAM. I hope Mistral releases a new small model with a gated delta net, because I think it could take the throne.
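To see why long context blows past 32GB even when the weights fit, a back-of-envelope KV-cache estimate helps. The config numbers below are purely illustrative assumptions, not Devstral's actual architecture:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: int = 2) -> int:
    """Rough KV-cache size for a dense transformer.

    Two cached tensors per layer (keys and values), each of shape
    [context_len, n_kv_heads, head_dim]. bytes_per_elem=2 assumes fp16.
    All parameters here are hypothetical example values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

# Hypothetical config: 40 layers, 8 KV heads, head_dim 128, 128k context, fp16
gib = kv_cache_bytes(40, 8, 128, 131072) / 2**30
print(f"{gib:.1f} GiB for the KV cache alone")  # → 20.0 GiB
```

Stack 20 GiB of cache on top of the model weights and 32 GB is gone fast, which is exactly where linear-attention-style mechanisms (or aggressive cache quantization) change the picture.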

These are my detailed findings from asking local models to explain how my code maps to the research paper it corresponds to.

TLDR: All four models listed above are incredibly capable local models, with Qwen 3.6 35B A3B standing out as the best. I'm also inclined to think that an intelligent human with any of these four models is more capable than something like Opus 4.7 on its own (see the detailed findings).

Please let me know your thoughts!

submitted by /u/The_Paradoxy
