cs.LG

Malliavin Calculus for Counterfactual Gradient Estimation in Adaptive Inverse Reinforcement Learning

arXiv:2604.01345v1 Announce Type: new
Abstract: Inverse reinforcement learning (IRL) recovers the loss function of a forward learner from its observed responses adaptive IRL aims to reconstruct the loss function of a forward learner by passively obser…