cs.AI, cs.CV

Interpreting Video Representations with Spatio-Temporal Sparse Autoencoders

arXiv:2604.03919v1
Abstract: We present the first systematic study of Sparse Autoencoders (SAEs) on video representations. Standard SAEs decompose video into interpretable, monosemantic features but destroy temporal coherence: hard …