cs.AI, cs.CL, cs.GT

Peer-Predictive Self-Training for Language Model Reasoning

arXiv:2604.13356v1 Announce Type: cross
Abstract: Designing mechanisms for continued self-improvement of language models without external supervision remains an open challenge. We propose Peer-Predictive Self-Training (PST), a label-free fine-tuning framework in…