Pref-CTRL: Preference Driven LLM Alignment using Representation Editing
arXiv:2604.23543v1 Announce Type: new
Abstract: Test-time alignment methods offer a promising alternative to fine-tuning by steering the outputs of large language models (LLMs) at inference time with lightweight interventions on their internal represe…