Those of you building with voice AI, how is it going?

r/LocalLLaMA
Generative AI

Genuine question. I was tempted to go deeper into voice AI, not just because of the hype, but because people keep saying it's the next big evolution after chat. At the same time, I keep hearing mixed opinions. Someone told me something that kind of stuck with me: voice AI tools are not really competing on models. They're competing on how well they handle everything around the model. One feels smooth in s, the other actually works in messy real-world conversations. For context, I've mostly worked with text-based LLMs for a long time, and I'm now building voice agents seriously.