StarDrinks: An English and Korean Test Set for SLU Evaluation in a Drink Ordering Scenario

arXiv:2604.26500v1 Announce Type: new
LLMs and speech assistants are increasingly used for task-oriented interactions, yet their evaluation often relies on controlled scenarios that fail to capture the variability and complexity of real user requests. Drink ordering, for example, involves diverse named entities, drink types, sizes, customizations, and brand-specific terminology, as well as spontaneous speech phenomena such as hesitations and self-corrections. To address this gap, we