cs.IR, cs.LG

Additive Control Variates Dominate Self-Normalisation in Off-Policy Evaluation

arXiv:2602.14914v2 Announce Type: replace
Abstract: Off-policy evaluation (OPE) is essential for assessing ranking and recommendation systems without costly online interventions. Self-Normalised Inverse Propensity Scoring (SNIPS) is a standard tool fo…
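
For orientation, the estimators contrasted in the title can be sketched in their standard textbook forms. This is a minimal illustration and not necessarily the formulation used in the paper: the function names (`ips`, `snips`, `additive_cv`, `plugin_beta`), the two-action toy logging setup, and the plug-in choice of the coefficient `beta` are all assumptions made here. With importance weights w_i = pi(a_i) / pi_0(a_i) and rewards r_i, IPS averages w_i * r_i, SNIPS divides the same sum by the sum of the weights, and an additive control variate subtracts beta * (mean(w) - 1), which has zero expectation for any fixed beta.

```python
import random


def ips(w, r):
    """Inverse propensity scoring: the mean of w_i * r_i."""
    return sum(wi * ri for wi, ri in zip(w, r)) / len(w)


def snips(w, r):
    """Self-normalised IPS: divide by the sum of weights instead of n."""
    return sum(wi * ri for wi, ri in zip(w, r)) / sum(w)


def additive_cv(w, r, beta):
    """IPS with an additive control variate: mean(w * (r - beta)) + beta.

    Equivalent to ips(w, r) - beta * (mean(w) - 1). Because E[w] = 1 under
    the logging policy, the correction has zero mean for any fixed beta.
    """
    n = len(w)
    return sum(wi * (ri - beta) for wi, ri in zip(w, r)) / n + beta


def plugin_beta(w, r):
    """Sample estimate of Cov(w * r, w) / Var(w), the variance-minimising
    coefficient for a fixed beta. Estimating it from the same data is an
    assumption of this sketch and introduces a small bias."""
    n = len(w)
    mean_w = sum(w) / n
    mean_wr = ips(w, r)
    cov = sum((wi * ri - mean_wr) * (wi - mean_w) for wi, ri in zip(w, r)) / n
    var = sum((wi - mean_w) ** 2 for wi in w) / n
    return cov / var if var > 0 else 0.0


# Hypothetical toy problem: two actions, Bernoulli rewards.
rng = random.Random(0)
logging_policy = [0.8, 0.2]   # pi_0(a), assumed known
target_policy = [0.3, 0.7]    # pi(a)
reward_means = [0.2, 0.6]     # true value of target: 0.3*0.2 + 0.7*0.6 = 0.48

weights, rewards = [], []
for _ in range(5000):
    a = 0 if rng.random() < logging_policy[0] else 1
    weights.append(target_policy[a] / logging_policy[a])
    rewards.append(1.0 if rng.random() < reward_means[a] else 0.0)

beta_hat = plugin_beta(weights, rewards)
print("IPS        :", ips(weights, rewards))
print("SNIPS      :", snips(weights, rewards))
print("additive CV:", additive_cv(weights, rewards, beta_hat))
```

One identity worth noting: plugging the SNIPS estimate itself in as `beta` reproduces SNIPS exactly, so self-normalisation can be read as one particular, data-dependent choice of the additive coefficient. Setting `beta = 0` recovers plain IPS.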