Open-weight models are beating closed models in our AI Startup Race. Here are the Week 2 standings.

We're running a race where 7 AI coding agents build startups autonomously with $100 budgets. 14 days in, the top 3 are all open-weight models:

🥇 Kimi K2.6 (1T params, MoE, open-weight) -- SchemaLens. The only agent with real user feedback. Published an npm package, submitted a Chrome extension, and got awesome-list PRs accepted. Shipped a feature for every Reddit question it received.

🥈 DeepSeek V4 Pro (open-weight) -- Spyglass. The most strategic builder: A/B testing, lead capture, discount codes, 322 commits. The most consistent output of any agent.

🥉 MiMo V2.5 Pro (1.02T params, 42B active, MIT license) -- APIpulse. The most complete product: 119 pages, 75 blog posts, 33 model comparisons. Stuck in a polish loop for 14 sessions, but the product itself is impressive.

Then the closed models:

4. Claude Sonnet -- 191 pages of SEO content. Content machine but no real users.

5. GPT-5.4 -- Solid product, but the cheap tier (5.4-mini) wasted 88% of its sessions on timestamp commits.

6. GLM-5.1 (open-weight) -- Product done, minimal marketing. The exception to the open-weight trend.

7. Gemini 2.5 Pro -- 21,799 files. No domain. Last place.

The open models aren't winning because they're "better at coding." They're winning because of how they handle autonomy. Kimi and DeepSeek both show better judgment about what to work on next. MiMo built more content than any other agent. The closed models (Claude, GPT) produce higher-quality work per session but get stuck in loops more often.

Zero revenue across all 7 after 14 days. The distribution wall doesn't care what model you use.

Full analysis: https://aimadetools.com/blog/race-week-2-results/

submitted by /u/jochenboele