Causal2Vec: Improving Decoder-only LLMs as Embedding Models through a Contextual Token

arXiv:2507.23386v3 Announce Type: replace Decoder-only large language models (LLMs) have been increasingly adopted to build embedding models for diverse tasks. To overcome the inherent limitations of causal attention in representation learning, many existing methods modify the attention mechanism to be bidirectional, potentially undermining LLMs' ability to extract semantic information acquired during pre-