FormalRewardBench: A Benchmark for Formal Theorem Proving Reward Models
arXiv:2605.10141v1 Announce Type: new
Abstract: Recent neural theorem provers use reinforcement learning with verifiable rewards (RLVR), where proof assistants provide binary correctness signals. While verifiable rewards are cheap and scalable without…