Do Not Imitate, Reinforce: Iterative Classification via Belief Refinement
arXiv:2604.22110v1 Announce Type: new
Abstract: Standard supervised classification trains models to imitate the exact labels provided by a perfect oracle. This imitation happens in a single pass, restricting the model to a fixed compute budget even wh…