Distributional Off-Policy Evaluation with Deep Quantile Process Regression
arXiv:2604.18143v1
Abstract: This paper investigates the off-policy evaluation (OPE) problem from a distributional perspective. Rather than focusing solely on the expectation of the total return, as in most existing OPE methods, we …