AI RESEARCH
Multi-Token Prediction via Self-Distillation
arXiv cs.LG
arXiv:2602.06019v2 Announce Type: replace-cross
Existing techniques for accelerating language model inference, such as speculative decoding, require