AI RESEARCH

Teaching Metric Distance to Discrete Autoregressive Language Models

arXiv CS.LG

ArXi:2503.02379v5 Announce Type: replace Large language models (LLMs) operate as autoregressive predictors over discrete token vocabularies, a formulation that has enabled their adaptation far beyond natural language to vision, robotics, and multimodal reasoning. However