SAVGO: Learning State-Action Value Geometry with Cosine Similarity for Continuous Control

arXiv:2605.00787v1 Announce Type: new While representation and similarity learning have improved the sample efficiency of Reinforcement Learning (RL), they are rarely used to shape policy updates directly in the action space. To bridge this gap, this paper proposes State-Action Value Geometry Optimization (SAVGO), a geometry-aware RL algorithm that explicitly incorporates value-based similarity into the policy update.