AI RESEARCH

Multi-Token Prediction via Self-Distillation

arXiv cs.LG

arXiv:2602.06019v2 Announce Type: replace-cross Existing techniques for accelerating language model inference, such as speculative decoding, require