Lever: Inference-Time Policy Reuse under Support Constraints
arXiv:2604.20174v2 Announce Type: replace
Abstract: Reinforcement learning (RL) policies are typically trained for fixed objectives, making reuse difficult when task requirements change. We study inference-time policy reuse: given a library of pre-tra…