Learning Discrete Autoregressive Priors with Wasserstein Gradient Flow
arXiv:2605.06148v1 Announce Type: cross
Abstract: Discrete image tokenizers are commonly trained in two stages: first for reconstruction, and then with a prior model fitted to the frozen token sequences. This decoupling leaves the tokenizer unaware of…