scalar-loop: a Python harness for Karpathy’s autoresearch pattern that doesn’t trust the agent’s narration

I built scalar-loop to solve one problem: LLM agents game their verifiers. The pattern is Karpathy's autoresearch loop: the LLM proposes an edit, the harness runs the metric, and the loop keeps or reverts the edit based on that number. Simple. Until you watch the agent, on …
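For readers who haven't seen the pattern, here is a minimal sketch of the propose / measure / keep-or-revert cycle. It is not scalar-loop's actual API: `run_loop`, `propose_edit`, `measure`, and the toy helpers are hypothetical names chosen for illustration, and "lower is better" is an assumption. The one thing it tries to show is that the keep-or-revert decision depends only on the number the harness measures, never on what the proposer says about its own edit.

```python
import itertools
from typing import Callable, List, Tuple

State = List[float]


def run_loop(
    state: State,
    propose_edit: Callable[[State], State],  # hypothetical stand-in for the LLM agent
    measure: Callable[[State], float],       # hypothetical stand-in for the metric
    steps: int,
) -> Tuple[State, float]:
    """Keep an edit only if the measured scalar strictly improves (lower is better)."""
    best_state = list(state)
    best_score = measure(best_state)
    for _ in range(steps):
        # The proposer works on a copy, so a rejected edit cannot leak into best_state.
        candidate = propose_edit(list(best_state))
        # The harness measures the candidate itself; the proposer reports nothing.
        score = measure(candidate)
        if score < best_score:
            best_state, best_score = candidate, score  # keep
        # Otherwise revert: the candidate is simply dropped.
    return best_state, best_score


def toy_metric(state: State) -> float:
    """Toy scalar: distance of the single parameter from 3.0."""
    return abs(state[0] - 3.0)


def make_toy_proposer() -> Callable[[State], State]:
    """Deterministic toy proposer that alternates +1.0 and -1.0 edits."""
    deltas = itertools.cycle([1.0, -1.0])

    def propose(state: State) -> State:
        state[0] += next(deltas)
        return state

    return propose


if __name__ == "__main__":
    final_state, final_score = run_loop([0.0], make_toy_proposer(), toy_metric, steps=6)
    print(final_state, final_score)  # [3.0] 0.0
```

In the toy run every `-1.0` edit makes the metric worse and is reverted, while every `+1.0` edit is kept until the parameter reaches 3.0. A real harness would revert files on disk rather than drop a list, but the decision rule has the same shape.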