cs.CV, cs.LG

Playing the network backward: A Game Theoretic Attribution Framework

arXiv:2605.06212v1 Announce Type: new
Abstract: Attribution methods explain which input features drive a model’s prediction, making them central to model debugging and mechanistic interpretability. Yet backward attribution methods, including gradients…