TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding

ArXi:2605.04962v1 Announce Type: new Foundation models have established unified representations for natural language processing, yet this paradigm remains largely unexplored for tabular data. Existing methods face fundamental limitations: LLM-based approaches lack retrieval-compatible vector outputs, whereas text embedding models often fail to capture tabular structure and numerical semantics. To bridge this gap, we first