AI RESEARCH

GEAR: Granularity-Adaptive Advantage Reweighting for LLM Agents via Self-Distillation

arXiv CS.AI

arXiv:2605.11853v1 Announce Type: cross Reinforcement learning has become a widely used post-