Yucheng Shi, Zhenwen Liang, Kishan Panaganti, Dian Yu, Wenhao Yu, Haitao Mi

Learning to Build the Environment: Self-Evolving Reasoning RL via Verifiable Environment Synthesis

Yucheng Shi, Zhenwen Liang, Kishan Panaganti, Dian Yu, Wenhao Yu, Haitao Mi / May 15, 2026

arXiv:2605.14392v1 Announce Type: new
Abstract: We pursue a vision for self-improving language models in which the model does not merely generate problems or traces to imitate, but constructs the environments that train it. In zero-data reasoning RL, …

Author name: Yucheng Shi, Zhenwen Liang, Kishan Panaganti, Dian Yu, Wenhao Yu, Haitao Mi

Learning to Build the Environment: Self-Evolving Reasoning RL via Verifiable Environment Synthesis