Diversity or Precision? A Deep Dive into Next Token Prediction

arXiv:2512.22955v4 Announce Type: replace Recent advancements have shown that reinforcement learning (RL) can substantially improve the reasoning abilities of large language models (LLMs). The effectiveness of such RL