/u/Exciting-Camera3226

Local models for coding have reached the threshold where they are feasible for real work

/u/Exciting-Camera3226 / April 28, 2026

We ran open-weight 27B–32B models on Terminal-Bench 2.0 (89 tasks, terminal-bench-2.git @ 69671fb) through our agent harness. Best result was Qwen 3.6-27B at 38.2% (34/89) under the default per-task timeout — the same constraint the public leader…
