Prompt-Activation Duality: Improving Activation Steering via Attention-Level Interventions
arXiv:2605.10664v2 Announce Type: replace-cross
Abstract: Activation steering controls language model behavior by adding directions to internal representations at inference time, but standard residual-stream steering can fail in stateful dialogue. We …
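For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of standard residual-stream steering only, i.e. adding a scaled direction to each token's hidden vector at inference time. It does not represent the attention-level intervention the paper proposes, which the truncated abstract does not describe. The function name `steer_residual`, the scale `alpha`, and the toy vectors are hypothetical and chosen for illustration.

```python
from typing import List


def steer_residual(
    hidden: List[List[float]],
    direction: List[float],
    alpha: float,
) -> List[List[float]]:
    """Return hidden states with alpha * direction added to every token vector.

    hidden:    one vector per token position (toy stand-in for a residual stream)
    direction: steering vector with the same dimensionality as each hidden vector
    alpha:     steering strength (hypothetical parameter name)
    """
    if any(len(h) != len(direction) for h in hidden):
        raise ValueError("hidden vectors and direction must have the same length")
    # Additive steering: h' = h + alpha * v, applied independently per position.
    return [[x + alpha * d for x, d in zip(h, direction)] for h in hidden]


# Toy example: two token positions, three hidden dimensions.
hidden = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0]]
direction = [0.0, 1.0, 0.0]
steered = steer_residual(hidden, direction, alpha=2.0)
```

In a real model this addition would typically be applied inside a forward hook on a chosen layer; the plain-list version above only shows the arithmetic.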