DualTurn: Learning Turn-Taking from Dual-Channel Generative Speech Pretraining

arXiv:2603.08216v1 Announce Type: cross Speech-to-speech models handle turn-taking naturally but offer limited support for tool-calling or complex reasoning, while production ASR-LLM-TTS voice pipelines offer these capabilities but rely on silence timeouts, which lead to unnatural turn-taking. We present DualTurn, which narrows this gap through generative pre