[D] Lossless tokenizers lose nothing and add nothing — trivial observation or worth formalizing?
r/MachineLearning
I wrote up a short information-theoretic argument for why lossless tokenization neither restricts the expressiveness of language models nor