cs.CV

Structured State-Space Regularization for Compact and Generation-Friendly Image Tokenization

arXiv:2604.11089v1 Announce Type: new
Abstract: Image tokenizers are central to modern vision models, which often operate in latent spaces. An ideal latent space must be simultaneously compact and generation-friendly: it should capture an image’s essent…