Lever: Inference-Time Policy Reuse under Support Constraints
arXiv:2604.20174v1
Abstract: Reinforcement learning (RL) policies are typically trained for fixed objectives, making reuse difficult when task requirements change. We study inference-time policy reuse: given a library of pre-trained…