An Information-Theoretic Approach to Understanding Transformers’ In-Context Learning of Variable-Order Markov Chains
arXiv:2410.05493v3 Announce Type: replace
Abstract: We study transformers’ in-context learning of variable-order Markov chains (VOMCs), focusing on the finite-sample accuracy as the number of in-context examples increases. Compared to fixed-order Mar…
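For readers unfamiliar with the model class, a variable-order Markov chain conditions each next symbol on the longest matching suffix ("context") of the history, so different states use contexts of different lengths. The sketch below is purely illustrative and not taken from the paper; the context set, probabilities, and function names are assumptions chosen for a minimal binary-alphabet example.

```python
import random

# Illustrative (hypothetical) VOMC over the alphabet {0, 1}.
# Each entry maps a context (a suffix of the history) to the
# distribution [P(next=0), P(next=1)].
CONTEXTS = {
    (0,): [0.9, 0.1],    # after a 0, emit 0 with prob 0.9
    (0, 1): [0.7, 0.3],  # after 0,1: emit 0 with prob 0.7
    (1, 1): [0.2, 0.8],  # after 1,1: emit 1 with prob 0.8
}
MAX_ORDER = 2

def next_dist(history):
    """Return the next-symbol distribution for the longest matching suffix."""
    for k in range(min(MAX_ORDER, len(history)), 0, -1):
        ctx = tuple(history[-k:])
        if ctx in CONTEXTS:
            return CONTEXTS[ctx]
    return [0.5, 0.5]  # fallback: uniform when no context matches

def sample(n, seed=0):
    """Draw n symbols from the VOMC, starting from history [0]."""
    rng = random.Random(seed)
    hist = [0]
    for _ in range(n):
        p0 = next_dist(hist)[0]
        hist.append(0 if rng.random() < p0 else 1)
    return hist[1:]
```

Note that contexts of length 1 and 2 coexist: the state after `0` needs only one symbol of history, while states after `1` need two, which is exactly the variable-order structure that distinguishes VOMCs from fixed-order chains.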