Learning to Learn-at-Test-Time: Language Agents with Learnable Adaptation Policies

arXiv:2604.00830v1 Announce Type: cross Test-Time Learning (TTL) enables language agents to iteratively refine their performance through repeated interactions with the environment at inference time. At the core of TTL is an adaptation policy that updates the actor policy based on experience from previous episodes, thereby improving future behavior. Existing methods rely on fixed, hand-crafted adaptation policies rather than optimizing them for downstream improvement.
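
The loop described above can be illustrated with a toy sketch. It is not the paper's method: it only shows the general shape of test-time learning with a fixed, hand-crafted adaptation rule, which is the kind of baseline the abstract contrasts against. All names here (`Episode`, `toy_env`, `actor`, `fixed_adaptation`, `run_ttl`) are hypothetical, and the "actor" is a trivial function over a list of integers rather than a language model conditioned on text.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Episode:
    """One interaction with the environment (hypothetical record type)."""
    action: int
    reward: float


def toy_env(action: int, target: int = 3) -> float:
    """Toy environment (assumption): reward is the negative distance to a hidden target."""
    return float(-abs(action - target))


def actor(context: List[int]) -> int:
    """Toy actor policy: follow the most recent suggestion in the context, else guess 0."""
    return context[-1] if context else 0


def fixed_adaptation(context: List[int], history: List[Episode]) -> List[int]:
    """Hand-crafted adaptation rule: keep a perfect guess, otherwise try the next integer.

    The rule is fixed in advance and is not optimized for downstream improvement.
    """
    last = history[-1]
    next_guess = last.action if last.reward == 0 else last.action + 1
    return context + [next_guess]


def run_ttl(
    actor_fn: Callable[[List[int]], int],
    adapt_fn: Callable[[List[int], List[Episode]], List[int]],
    env_fn: Callable[[int], float],
    num_episodes: int,
) -> List[Episode]:
    """Run repeated episodes, updating the actor's context after each one."""
    context: List[int] = []
    history: List[Episode] = []
    for _ in range(num_episodes):
        action = actor_fn(context)          # actor policy acts given its current context
        reward = env_fn(action)             # environment returns feedback
        history.append(Episode(action, reward))
        context = adapt_fn(context, history)  # adaptation policy updates the actor
    return history


history = run_ttl(actor, fixed_adaptation, toy_env, num_episodes=5)
rewards = [ep.reward for ep in history]
actions = [ep.action for ep in history]
print(actions, rewards)
```

In this sketch the adaptation step is the only place where experience from earlier episodes changes later behavior. Replacing `fixed_adaptation` with a function whose parameters are trained to maximize improvement across episodes would correspond, loosely, to the learnable adaptation policy that the title refers to.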