Authors: Sarah Liaw, Benjamin Plaut

Learning When Not to Learn: Risk-Sensitive Abstention in Bandits with Unbounded Rewards

Sarah Liaw, Benjamin Plaut / April 14, 2026

arXiv:2510.14884v3 Announce Type: replace
Abstract: In high-stakes AI applications, even a single action can cause irreparable damage. However, nearly all of sequential decision-making theory assumes that all errors are recoverable (e.g., by bounding …

cs.AI, cs.LG

Learning When Not to Learn: Risk-Sensitive Abstention in Bandits with Unbounded Rewards

Sarah Liaw, Benjamin Plaut / March 31, 2026

arXiv:2510.14884v2 Announce Type: replace
Abstract: In high-stakes AI applications, even a single action can cause irreparable damage. However, nearly all of sequential decision-making theory assumes that all errors are recoverable (e.g., by bounding …