Xinshun Feng, Xinhao Song, Lijun Li, Gongshen Liu, Jing Shao

SEARL: Joint Optimization of Policy and Tool Graph Memory for Self-Evolving Agents

April 14, 2026

arXiv:2604.07791v2 Announce Type: replace-cross
Abstract: Recent advances in Reinforcement Learning with Verifiable Rewards (RLVR) have demonstrated significant potential in single-turn reasoning tasks. With the paradigm shift toward self-evolving age…
