Same 9B Qwen weights: 19.1% in Aider vs 45.6% with a scaffold adapted to small local models
I spent the past week testing a simple question: small local models often look weak inside coding agents, but how much of that is actual model weakness, and how much is scaffold mismatch? So I held the model fixed and changed only the scaffold. Same …