Vikram Krishnamurthy, Luke Snow

Malliavin Calculus for Counterfactual Gradient Estimation in Adaptive Inverse Reinforcement Learning

Vikram Krishnamurthy, Luke Snow / April 3, 2026

arXiv:2604.01345v1 Announce Type: new
Abstract: Inverse reinforcement learning (IRL) recovers the loss function of a forward learner from its observed responses. Adaptive IRL aims to reconstruct the loss function of a forward learner by passively obser…
