cs.CL, cs.LG, stat.ML

Task Vector Geometry Underlies Dual Modes of Task Inference in Transformers

arXiv:2605.03780v1 Announce Type: cross
Abstract: Transformers are effective at inferring the latent task from context via two inference modes: recognizing a task seen during training, and adapting to a novel one. Recent interpretability studies have …