Same Signal, Opposite Meaning: Direction-Informed Adaptive Learning for LLM Agents

ArXi:2605.06908v1 Announce Type: cross Adaptive test-time compute for LLM agents aims to invoke extra computation only when it improves performance. Existing methods typically use confidence-, uncertainty-, or difficulty-based gates, assuming a fixed direction from the gating signal through compute need to the value of computation. This makes gating a utility-calibration problem: gating signals should align with whether extra computation improves the final outcome over the base policy.