cs.AI, cs.CL, cs.LG

Neurons Speak in Ranges: Breaking Free from Discrete Neuronal Attribution

arXiv:2502.06809v3 Announce Type: replace
Abstract: Pervasive polysemanticity in large language models (LLMs) undermines discrete neuron-concept attribution, posing a significant challenge for model interpretation and control. We systematically analyz…