cs.LG

Efficient RL Training for LLMs with Experience Replay

arXiv:2604.08706v1 Announce Type: new
Abstract: While Experience Replay – the practice of storing rollouts and reusing them multiple times during training – is a foundational technique in general RL, it remains largely unexplored in LLM post-training …
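To make the definition above concrete, here is a minimal sketch of a buffer that stores rollouts and reuses them several times during training. It is an illustration of the general idea only, not the method of this paper: the names `Rollout`, `ReplayBuffer`, `capacity`, and `max_uses` are hypothetical, and the uniform sampling and oldest-first eviction are assumptions chosen for simplicity.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class Rollout:
    """One stored generation (hypothetical fields, for illustration only)."""
    prompt: str
    response: str
    reward: float
    uses: int = 0  # how many times this rollout has been sampled for training


class ReplayBuffer:
    """Stores rollouts and lets each one be reused up to `max_uses` times."""

    def __init__(self, capacity: int, max_uses: int):
        self.capacity = capacity  # assumed cap on the number of stored rollouts
        self.max_uses = max_uses  # assumed reuse budget per rollout
        self._items = deque()

    def add(self, rollout: Rollout) -> None:
        # Append the new rollout, then evict the oldest ones if over capacity.
        self._items.append(rollout)
        while len(self._items) > self.capacity:
            self._items.popleft()

    def sample(self, batch_size: int, rng: random.Random) -> list:
        # Draw a uniform batch without replacement and count the reuse.
        k = min(batch_size, len(self._items))
        batch = rng.sample(list(self._items), k)
        for rollout in batch:
            rollout.uses += 1
        # Retire rollouts that have exhausted their reuse budget.
        self._items = deque(r for r in self._items if r.uses < self.max_uses)
        return batch

    def __len__(self) -> int:
        return len(self._items)
```

In a training loop, freshly generated rollouts would go through `add`, and each optimizer step would draw its batch from `sample`, so that one generation pass can feed several updates.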