Layerwise Dynamics for In-Context Classification in Transformers
arXiv:2604.11613v2 Announce Type: replace
Abstract: Transformers can perform in-context classification from a few labeled examples, yet the inference-time algorithm remains opaque. We study multi-class linear classification in the hard no-margin regim…