Supernodes and Halos: Loss-Critical Hubs in LLM Feed-Forward Layers
arXiv:2604.23475v1 Announce Type: cross
Abstract: We study the organization of channel-level importance in transformer feed-forward networks (FFNs). Using a Fisher-style loss proxy (LP) based on activation-gradient second moments, we show that loss se…
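As a rough illustration of a loss proxy built from activation-gradient second moments, the sketch below scores each FFN channel by the mean squared product of its activation and the loss gradient with respect to that activation. This particular form is an assumption, since the abstract text available here does not give the exact LP definition. The function names (`loss_proxy`, `top_k_mass`) and the toy values are hypothetical.

```python
def loss_proxy(activations, gradients):
    """Per-channel mean of (activation * gradient)^2 over samples.

    ASSUMPTION: this is one plausible Fisher-style form, not necessarily
    the paper's exact LP. Inputs are lists of rows (samples x channels).
    """
    n_samples = len(activations)
    n_channels = len(activations[0])
    scores = [0.0] * n_channels
    for a_row, g_row in zip(activations, gradients):
        for j, (a, g) in enumerate(zip(a_row, g_row)):
            scores[j] += (a * g) ** 2
    return [s / n_samples for s in scores]


def top_k_mass(scores, k):
    """Fraction of the total score held by the k highest-scoring channels."""
    total = sum(scores)
    if total == 0:
        return 0.0
    return sum(sorted(scores, reverse=True)[:k]) / total


# Toy data (hypothetical): 2 samples, 3 channels.
acts = [[1.0, 0.5, 2.0], [1.0, 0.5, 2.0]]
grads = [[0.1, 0.2, 1.0], [0.1, 0.2, 1.0]]
scores = loss_proxy(acts, grads)
hub_share = top_k_mass(scores, 1)  # share of total score in the top channel
```

In practice the activations and gradients would come from forward and backward passes over a calibration set. A `top_k_mass` value close to 1 for small `k` would indicate that a few channels dominate the proxy.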