AI RESEARCH
Architecture advice: Real-time pipeline for YouTube Audio -> Whisper -> LLM -> SSE (Sub-10s latency) [D]
r/MachineLearning
•
Currently, my flow is a slow waterfall: Download full audio -> Whisper -> LLM -> Return results. For a 30-minute video, the user waits forever.