Aram Ebtekar, Michael K. Cohen

Golden Handcuffs make safer AI agents

Aram Ebtekar, Michael K. Cohen / April 16, 2026

arXiv:2604.13609v1 Announce Type: cross
Abstract: Reinforcement learners can attain high reward through novel unintended strategies. We study a Bayesian mitigation for general environments: we expand the agent’s subjective reward range to include a la…
