Harshavardhan - Provide.ai

Feature Rivalry in Sparse Autoencoder Representations: A Mechanistic Study of Uncertainty-Driven Feature Competition in LLMs

Harshavardhan / May 12, 2026

arXiv:2605.08149v1 Announce Type: new
Abstract: Sparse Autoencoders (SAEs) decompose large language model representations into interpretable features, but how these features interact under uncertainty remains poorly understood. We introduce Feature Ri…

Author name: Harshavardhan

Feature Rivalry in Sparse Autoencoder Representations: A Mechanistic Study of Uncertainty-Driven Feature Competition in LLMs