When (and How) to Trust the Expert: Diagnosing Query-Time Expert-Guided Reinforcement Learning
arXiv:2605.09109v1 Announce Type: new
Abstract: Many continuous-control problems ship with a competent but suboptimal controller (a tuned PID, a hand-designed gait). A growing family of methods uses such controllers as queryable experts during RL, but…