AI RESEARCH

Token Distillation: Attention-aware Input Embeddings For New Tokens

arXiv cs.LG

arXiv:2505.20133v3 Announce Type: replace-cross Current language models rely on static vocabularies determined at pre