Soft Head Selection for Injecting ICL-Derived Task Embeddings

arXiv:2507.20906v3 Announce Type: replace Large language models (LLMs) are commonly adapted to downstream tasks using parameter-efficient fine-tuning (PEFT) or in-context learning (ICL). Recently, ICL-driven embedding-based adaptation has been proposed as a distinct task adaptation paradigm. It derives task-specific embeddings from intermediate activations using few-shot prompts and injects them during inference. Despite its conceptual appeal, this approach has not demonstrated consistent performance gains over PEFT or ICL, and its empirical advantages have been limited in practice.
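
The derive-then-inject idea described above can be illustrated with a toy sketch. This is not the paper's method and does not implement soft head selection; it only assumes a tiny stack of residual layers standing in for an LLM, averages the activations at one layer over a few "few-shot" inputs, and adds that average back at the same layer when processing a query. All names here (`toy_layer`, `forward`, `derive_task_embedding`, `inject_at`, `scale`) are hypothetical and chosen for illustration.

```python
import math
import random

def toy_layer(hidden, weights):
    """One toy residual layer: h_i + tanh((W h)_i). Stands in for a transformer block."""
    out = []
    for i, row in enumerate(weights):
        pre = sum(w * h for w, h in zip(row, hidden))
        out.append(hidden[i] + math.tanh(pre))
    return out

def forward(x, layers, inject_at=None, task_embedding=None, scale=1.0):
    """Run the toy model; optionally add a task embedding after layer `inject_at`.

    Returns (final hidden state, list of per-layer activations).
    """
    hidden = list(x)
    activations = []
    for idx, weights in enumerate(layers):
        hidden = toy_layer(hidden, weights)
        if inject_at == idx and task_embedding is not None:
            # Injection step (assumed additive): shift the hidden state.
            hidden = [h + scale * t for h, t in zip(hidden, task_embedding)]
        activations.append(hidden)
    return hidden, activations

def derive_task_embedding(few_shot_inputs, layers, layer_idx):
    """Average the activations at `layer_idx` over the few-shot inputs."""
    total = [0.0] * len(few_shot_inputs[0])
    for x in few_shot_inputs:
        _, acts = forward(x, layers)
        total = [t + a for t, a in zip(total, acts[layer_idx])]
    return [t / len(few_shot_inputs) for t in total]

# Toy setup: random weights and random vectors in place of encoded prompts.
rng = random.Random(0)
DIM, N_LAYERS, LAYER_IDX = 4, 3, 1
layers = [[[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(DIM)]
          for _ in range(N_LAYERS)]
few_shot = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]
query = [rng.uniform(-1, 1) for _ in range(DIM)]

task_embedding = derive_task_embedding(few_shot, layers, LAYER_IDX)
plain, _ = forward(query, layers)
steered, _ = forward(query, layers, inject_at=LAYER_IDX,
                     task_embedding=task_embedding)
```

In this sketch the query is processed without any few-shot examples in its input; the task signal enters only through the injected vector. Which layer (or which attention heads) to inject into, and how strongly, are exactly the kinds of choices the sketch leaves as fixed assumptions.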