Qwen3 TTS in C++ with 1.7B support, speaker encoding extraction, and desktop UI
r/LocalLLaMA
Generative AI
I've spent the last few weekends working on a Qwen3 TTS implementation. It is a fork of an existing project, with more features and a cleaner codebase. It currently supports:

- the 1.7B model
- speaker encoding extraction
- a JNI interface
- speaker instructions (custom voice models)
- voice cloning with both base models (0.6B and 1.7B)

I also built a desktop app UI for it using Kotlin Multiplatform. The app must be compiled from source and works on Windows and Linux. Models still need to be converted to GGUF manually.

Both repos are missing a bit of polish, but they are in a state that I feel comfortable posting here.