From 1939 to voice clones in 3 seconds — the full AI speech timeline and where it's heading
r/LocalLLaMA
Generative AI
AI Safety
Dennis Klatt's voice became Stephen Hawking's voice. WaveNet scored a mean opinion score (MOS) of 4.21 vs 3.86 for the best previous system. Tacotron 2 hit 4.53 vs 4.58 for real human speech. VALL-E cloned voices from 3 seconds of audio. Microsoft refused to release VALL-E 2. Now Kokoro (82M params, trained for $400) competes with ElevenLabs ($11B). A 2025 Nature study found people rated AI voices as more trustworthy than real ones.

submitted by /u/FunSignificance4405