Reinforcement Learning for Compositional Generalization with Outcome-Level Optimization
arXiv:2605.04920v1 Announce Type: new
Abstract: Compositional generalization refers to correctly interpret novel combinations of known primitives, which remains a major challenge. Existing approaches often rely on supervised fine-tuning, which encoura…