Voxtral TTS: open-weight model for natural, expressive, and ultra-fast text-to-speech

r/StableDiffusion
Generative AI, Open Source AI

Highlights: Realistic, emotionally expressive speech in 9 popular languages, with support for diverse dialects. Very low time-to-first-audio latency. Easily adaptable to new voices. Enterprise-grade text-to-speech for powering critical voice agent workflows.

submitted by /u/fruesome