cs.AI, cs.CL, cs.LG

Non-linear Interventions on Large Language Models

arXiv:2605.14749v1 Announce Type: cross
Abstract: Intervention is one of the most representative and widely used methods for understanding the internal representations of large language models (LLMs). However, existing intervention methods are confine…