AI RESEARCH
GEAR: Granularity-Adaptive Advantage Reweighting for LLM Agents via Self-Distillation
arXiv CS.AI
arXiv:2605.11853v1 Announce Type: cross Reinforcement learning has become a widely used post-