How speech models fail where it matters most, and what to do about it

Together AI Blog

State-of-the-art speech models like Whisper and Deepgram score near-human on benchmarks, then fail 39% of the time on street names. New research from Together AI exposes the gap and proposes a fix.