Rachel Freedman, Justin Svegliato, Kyle Wray, Stuart Russell

Active teacher selection for reward learning

Rachel Freedman, Justin Svegliato, Kyle Wray, Stuart Russell / May 11, 2026

arXiv:2310.15288v3 Announce Type: replace
Abstract: Reward learning techniques enable machine learning systems to learn objectives from human feedback. A core limitation of these systems is their assumption that all feedback comes from a single human …
