cs.LG

Graph Learning Is Suboptimal in Causal Bandits

arXiv:2510.16811v3 Announce Type: replace
Abstract: We study regret minimization in causal bandits under causal sufficiency, where the underlying causal structure is unknown to the agent. Previous work has focused on identifying the reward’s parents …