AI RESEARCH
Beyond Test-Time Training: Learning to Reason via Hardware-Efficient Optimal Control
arXiv cs.LG
arXiv:2603.09221v1 Announce Type: new Associative memory has long underpinned the design of sequential models. Beyond recall, humans reason by projecting future states and selecting goal-directed actions, a capability that modern language models increasingly require but do not natively encode. While prior work uses reinforcement learning or test-time