/u/kalpitdixit - Provide.ai

Open-source 9-task benchmark for coding-agent retrieval augmentation. Per-task deltas +0.010 to +0.320, all evals reproducible [P]

/u/kalpitdixit / April 25, 2026

Sharing an open-source benchmark suite (paper-lantern-challenges) that measures coding-agent performance with vs without retrieval-augmented technique selection across 9 everyday software tasks. Disclosure: I'm the author of the retrieval sys…

Author name: /u/kalpitdixit

Open-source 9-task benchmark for coding-agent retrieval augmentation. Per-task deltas +0.010 to +0.320, all evals reproducible [P]