Beyond Chain-of-Thought: Rewrite as a Universal Interface for Generative Multimodal Embeddings

arXiv cs.CV

arXiv:2604.22280v1

Multimodal Large Language Models (MLLMs) have emerged as a promising foundation for universal multimodal embeddings. Recent studies have shown that reasoning-driven generative multimodal embeddings can outperform discriminative embeddings on several embedding tasks. However, Chain-of-Thought (CoT) reasoning tends to generate redundant thinking steps and