AI RESEARCH

GRPO-TTA: Test-Time Visual Tuning for Vision-Language Models via GRPO-Driven Reinforcement Learning

arXiv CS.LG

arXiv:2605.03403v1 Announce Type: cross

Group Relative Policy Optimization (GRPO) has recently shown strong performance in post-