MIXAR: Scaling Autoregressive Pixel-based Language Models to Multiple Languages and Scripts
arXiv:2604.11575v1 Announce Type: new
Abstract: Pixel-based language models are gaining momentum as alternatives to traditional token-based approaches, promising to circumvent tokenization challenges. However, the inherent perceptual diversity across …