Bahareh Tolooshams, Ailsa Shen, Anima Anandkumar

Mechanistic Interpretability with Sparse Autoencoder Neural Operators

Bahareh Tolooshams, Ailsa Shen, Anima Anandkumar / May 11, 2026

arXiv:2509.03738v4
Abstract: We introduce sparse autoencoder neural operators (SAE-NOs), a new class of sparse autoencoders that operate in function spaces rather than fixed-dimensional Euclidean representations. We formal…
