AI RESEARCH
Cross-Tokenizer LLM Distillation through a Byte-Level Interface
arXiv cs.CL
arXiv:2604.07466v1 Announce Type: new
Cross-tokenizer distillation (CTD), the transfer of knowledge from a teacher to a student language model when the two use different tokenizers, remains a largely unsolved problem. Existing approaches rely on heuristic strategies to align mismatched vocabularies,