OmniVoice - 600+ Language Open-Source TTS with Voice Cloning and Voice Design
r/LocalLLaMA
OmniVoice is a state-of-the-art zero-shot multilingual TTS model supporting more than 600 languages. Built on a novel diffusion language model architecture, it generates high-quality speech with superior inference speed and supports both voice cloning and voice design.

Key Features
- 600+ Languages Supported: The broadest language coverage among zero-shot TTS models.
- Voice Cloning: State-of-the-art voice cloning quality.
- Voice Design: Control voices via assigned speaker attributes (gender, age, pitch, dialect/accent, whisper, etc.).
- Fast Inference: RTF as low as 0.025 (40x faster than real-time).
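
For anyone unsure how the RTF figure quoted above maps to "40x faster than real-time": real-time factor is conventionally the time spent synthesizing divided by the duration of the audio produced, so the speedup is just its reciprocal. A minimal sketch of that arithmetic follows. The function names are made up for illustration and are not part of OmniVoice, and the 10-second clip is an assumed example, not a benchmark from the project.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent generating / duration of the generated audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def speedup_over_real_time(rtf: float) -> float:
    """How many seconds of audio are produced per second of compute."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


if __name__ == "__main__":
    # Hypothetical example: a 10-second clip generated in 0.25 seconds.
    rtf = real_time_factor(synthesis_seconds=0.25, audio_seconds=10.0)
    print(f"RTF: {rtf:.3f}")
    print(f"Speedup: {speedup_over_real_time(rtf):.0f}x real-time")
```

An RTF below 1 means the model generates audio faster than it plays back. Keep in mind that "as low as" describes a best case, so the number you see locally will depend on hardware and settings.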