cs.CV, cs.LG

Teaching Metric Distance to Discrete Autoregressive Language Models

arXiv:2503.02379v5 Announce Type: replace
Abstract: Large language models (LLMs) operate as autoregressive predictors over discrete token vocabularies, a formulation that has enabled their adaptation far beyond natural language to vision, robotics, an…