Ruikang Zhang, Shuo Wang, Qi Su

Mechanistic Knobs in LLMs: Retrieving and Steering High-Order Semantic Features via Sparse Autoencoders

Ruikang Zhang, Shuo Wang, Qi Su / April 8, 2026

arXiv:2601.02978v2
Abstract: Recent work in Mechanistic Interpretability (MI) has enabled the identification and intervention of internal features in Large Language Models (LLMs). However, a persistent challenge lies in li…
