AI RESEARCH
Token Distillation: Attention-aware Input Embeddings For New Tokens
arXiv CS.LG
arXiv:2505.20133v3 Announce Type: replace-cross Current language models rely on static vocabularies determined at pre