cs.CL

Learning to Reason in Structured In-context Environments with Reinforcement Learning

arXiv:2509.23330v2 Announce Type: replace
Abstract: Large language models (LLMs) have substantially improved their reasoning capabilities through reinforcement learning (RL) driven by environmental exploration. As the intrinsic properties of the env…