Task Vectors, Learned Not Extracted: Performance Gains and Mechanistic Insight

arXiv:2509.24169v2 Announce Type: replace Large Language Models (LLMs) can perform new tasks from in-context demonstrations, a phenomenon known as in-context learning (ICL). Recent work suggests that these demonstrations are compressed into task vectors (TVs), compact task representations that LLMs exploit for predictions. However, prior studies typically extract TVs from model outputs or hidden states using cumbersome and opaque methods, and they rarely elucidate the mechanisms by which TVs influence computation. In this work, we address both limitations. First, we propose directly