ScaleBox: Enabling High-Fidelity and Scalable Code Verification for Large Language Models
arXiv:2604.27467v1 Announce Type: cross
Abstract: Code sandboxes have emerged as critical infrastructure for advancing the coding capabilities of large language models, providing verifiable feedback for both RL training and evaluation. However, exis…