cs.LG, cs.SY, eess.SY

Transformer-like Inference from Optimal Control

arXiv:2605.15608v1 Announce Type: new
Abstract: Decoder-only transformers compute the conditional probability of the next token from a sequence of past observations. This paper derives, from first principles, inference architectures that solve the sam…