Ruben Fernandez-Boullon, David N. Olivieri

Patch-Effect Graph Kernels for LLM Interpretability

Ruben Fernandez-Boullon, David N. Olivieri / May 8, 2026

arXiv:2605.06480v1 Announce Type: cross
Abstract: Mechanistic interpretability aims to reverse-engineer transformer computations by identifying causal circuits through activation patching. However, scaling these interventions across diverse prompts an…
