Author name: Gregory N. Frank

How Alignment Routes: Localizing, Scaling, and Controlling Policy Circuits in Language Models

Gregory N. Frank / April 8, 2026

arXiv:2604.04385v2 Announce Type: replace-cross
Abstract: This paper identifies a recurring sparse routing mechanism in alignment-trained language models: a gate attention head reads detected content and triggers downstream amplifier heads that boost …

cs.AI, cs.CL, cs.LG

How Alignment Routes: Localizing, Scaling, and Controlling Policy Circuits in Language Models

Gregory N. Frank / April 7, 2026

arXiv:2604.04385v1 Announce Type: cross
Abstract: We identify a recurring sparse routing mechanism in alignment-trained language models: a gate attention head reads detected content and triggers downstream amplifier heads that boost the signal toward …