Causal Bandit Over Unknown Graphs: Upper Confidence Bounds With Backdoor Adjustment
arXiv:2502.02020v3 Announce Type: replace
Abstract: The causal bandit problem seeks to identify, through sequential experimentation, an intervention that maximizes the expected reward in a causal system modeled by a directed acyclic graph (DAG). Exist…
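As a rough illustration of the problem statement only, the sketch below treats each intervention as an arm and runs the textbook UCB1 rule on a toy causal system. This is not the paper's algorithm: it uses no backdoor adjustment, and it assumes the interventional rewards can be sampled directly. The graph (Z -> Y, X -> Y), the probabilities in `P_Y`, and the names `sample_reward` and `ucb1` are all hypothetical and chosen for illustration.

```python
import math
import random

# Hypothetical toy system: Z -> Y and X -> Y, where X is set by intervention.
# P_Y[(x, z)] = P(Y = 1 | do(X = x), Z = z); the numbers are made up.
P_Y = {(0, 0): 0.2, (0, 1): 0.4, (1, 0): 0.6, (1, 1): 0.8}


def sample_reward(x, rng):
    """Draw one reward Y under the intervention do(X = x)."""
    z = 1 if rng.random() < 0.5 else 0  # Z ~ Bernoulli(0.5), unaffected by do(X)
    return 1.0 if rng.random() < P_Y[(x, z)] else 0.0


def ucb1(arms, horizon, rng):
    """Standard UCB1 over a finite set of interventions (arms)."""
    counts = {a: 0 for a in arms}
    means = {a: 0.0 for a in arms}
    for t in range(1, horizon + 1):
        untried = [a for a in arms if counts[a] == 0]
        if untried:
            # Play every intervention once before using confidence bounds.
            arm = untried[0]
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(
                arms,
                key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = sample_reward(arm, rng)
        counts[arm] += 1
        # Incremental update of the empirical mean reward.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


counts, means = ucb1([0, 1], 2000, random.Random(0))
best = max(counts, key=counts.get)  # intervention played most often
```

In this toy setup the expected rewards are 0.3 under do(X = 0) and 0.7 under do(X = 1), so the rule should concentrate its pulls on do(X = 1) as the horizon grows.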