AI RESEARCH
On Information Self-Locking in Reinforcement Learning for Active Reasoning of LLM agents
arXiv CS.AI
arXiv:2603.12109v1 Announce Type: new Reinforcement learning (RL) with outcome-based rewards has achieved significant success in