AI RESEARCH
Restoring Exploration after Post-Training: Latent Exploration Decoding for Large Reasoning Models
arXiv cs.LG
arXiv:2602.01698v2 Announce Type: replace-cross Large Reasoning Models (LRMs) have recently achieved strong mathematical and code reasoning performance through Reinforcement Learning (RL) post-