cs.AI, cs.LG, cs.SY, eess.SY, math.OC, stat.ML

Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control

arXiv:2604.19018v1 Announce Type: new
Abstract: Inference-time LLM alignment methods, particularly activation steering, offer an alternative to fine-tuning by directly modifying activations during generation. Existing methods, however, often rely on n…