Contrastive Semantic Projection: Faithful Neuron Labeling with Contrastive Examples
arXiv:2604.22477v2
Abstract: Neuron labeling assigns textual descriptions to internal units of deep networks. Existing approaches typically rely on highly activating examples, often yielding broad or misleading labels by focusin…