SpatialReward: Bridging the Perception Gap in Online RL for Image Editing via Explicit Spatial Reasoning

arXiv:2602.07458v3 Announce Type: replace Online Reinforcement Learning (RL) offers a promising avenue for complex image editing but is currently constrained by the scarcity of reliable and fine-grained reward signals. Existing evaluators frequently struggle with a critical perception gap we term "Attention Collapse," where models neglect cross-image comparisons and fail to capture fine-grained details, resulting in inaccurate perception and miscalibrated scores.