AI RESEARCH

SommBench: Assessing Sommelier Expertise of Language Models

arXiv cs.AI

arXiv:2603.12117v1 Announce Type: cross

With the rapid advances in large language models, it is becoming increasingly important to systematically evaluate their multilingual and multicultural capabilities. Previous cultural evaluation benchmarks focus mainly on basic cultural knowledge that can be encoded in linguistic form. Here, we propose SommBench, a multilingual benchmark to assess sommelier expertise, a domain deeply grounded in the senses of smell and taste.