OpenAI Voice Intelligence APIs: Real-Time Audio for Developers

The short version: three models, three jobs

Around 8 May 2026, OpenAI expanded its Realtime API with three dedicated audio intelligence models. These are not incremental patches to the existing gpt-4o-realtime-preview offering; they supersede it with purpose-built models for distinct voice workloads. The trio covers conversational reasoning, live cross-lingual translation, and streaming transcription, and all three share the same WebSocket-based session architecture.