AI RESEARCH

Restoring Exploration after Post-Training: Latent Exploration Decoding for Large Reasoning Models

arXiv cs.LG

arXiv:2602.01698v2 Announce Type: replace-cross Large Reasoning Models (LRMs) have recently achieved strong mathematical and code reasoning performance through Reinforcement Learning (RL) post-