AI RESEARCH
How Transformers Learn to Plan via Multi-Token Prediction
arXiv CS.AI
arXiv:2604.11912v1 Announce Type: cross While next-token prediction (NTP) has been the standard objective for