Do you use LLMs with TTS and speech recognition?

r/LocalLLaMA

As the title says: do you talk to your LLM using speech recognition and listen to its answers through a TTS model? I didn't sleep much last night, so I sat down at my computer, installed Fast-Kokoro for TTS, and configured KoboldCpp with a Whisper model. So far it has been a great experience with SillyTavern and the small Gemma 4 E4B model. I have an RTX 4060 Ti with 16 GB of VRAM and 32 GB of RAM, and with this setup (SillyTavern + KoboldCpp + Whisper + Gemma 4 E4B + Fast-Kokoro) it runs almost in real time, so it is realistic to use for voice conversation.
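
For anyone wondering how the pieces chain together, here is a rough Python sketch of one voice turn (speech recognition, then the LLM, then TTS) with per-stage timing, which is a simple way to check whether a setup like this is really "almost real time". Everything in it is an assumption for illustration: `voice_turn` and the three callables are hypothetical names, and in a real setup each callable would wrap whatever your Whisper, LLM, and TTS backends expose. SillyTavern does this wiring for you, so this is only meant to show the flow.

```python
import time
from typing import Callable, Dict, Tuple


def voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],  # hypothetical: wraps your Whisper backend
    generate: Callable[[str], str],      # hypothetical: wraps your LLM backend
    speak: Callable[[str], bytes],       # hypothetical: wraps your TTS backend
) -> Tuple[bytes, Dict[str, float]]:
    """Run one speech -> text -> reply -> speech turn and time each stage."""
    timings: Dict[str, float] = {}

    # Stage 1: speech recognition (audio in, text out).
    start = time.perf_counter()
    user_text = transcribe(audio)
    timings["stt"] = time.perf_counter() - start

    # Stage 2: the LLM turns the recognized text into a reply.
    start = time.perf_counter()
    reply_text = generate(user_text)
    timings["llm"] = time.perf_counter() - start

    # Stage 3: TTS turns the reply text back into audio.
    start = time.perf_counter()
    reply_audio = speak(reply_text)
    timings["tts"] = time.perf_counter() - start

    # Total latency is the sum of the three stages measured above.
    timings["total"] = sum(timings.values())
    return reply_audio, timings
```

To try it without any backend running, pass stand-in functions, for example `voice_turn(b"fake audio", lambda a: "hello", lambda s: s.upper(), lambda s: s.encode("utf-8"))`, and then swap in real wrappers one at a time. The `timings` dict shows which stage dominates the delay.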