Tianyi Ma, Tengyao Wang, Richard J. Samworth

Optimal In-context Adaptivity and Distributional Robustness of Transformers

Tianyi Ma, Tengyao Wang, Richard J. Samworth / May 8, 2026

arXiv:2510.23254v3
Abstract: We study in-context learning problems where a Transformer is pretrained on tasks drawn from a mixture distribution $\pi=\sum_{\alpha\in\mathcal{A}} \lambda_{\alpha} \pi_{\alpha}$, called the pretrain…
