Daking Rai, Mor Geva, Ziyu Yao

Data-driven Circuit Discovery for Interpretability of Language Models

Daking Rai, Mor Geva, Ziyu Yao / May 12, 2026

arXiv:2605.09129v1 Announce Type: new
Abstract: Circuit discovery aims to explain how language models (LMs) implement a specific task by localizing and interpreting a circuit, a computational subgraph responsible for the LM’s behavior. Existing circui…
