cs.AI, cs.CL, cs.LG, stat.ML

Painless Activation Steering: An Automated, Lightweight Approach for Post-Training Large Language Models

arXiv:2509.22739v3 Announce Type: replace
Abstract: Language models (LMs) are typically post-trained for desired capabilities and behaviors via weight-based or prompt-based steering, but the former is time-consuming and expensive, and the latter is no…