Voxtral WebGPU: Real-time speech transcription entirely in your browser with Transformers.js

r/LocalLLaMA

Mistral recently released Voxtral-Mini-4B-Realtime, a multilingual, real-time speech-transcription model that supports 13 languages and is capable of <500 ms latency. Today, we added support for it to Transformers.js, enabling live captioning entirely locally in the browser on WebGPU. Hope you like it! Link to (+ source code):

submitted by /u/xenovatech