How I Crashed My AI Agent Fleet in 30 Minutes (And Fixed It): VRAM Management on Apple Silicon

Dev.to AI
Generative AI AI Hardware

How I Crashed My AI Agent Fleet in 30 Minutes (And Fixed It): VRAM Management on Apple Silicon I learned this the hard way at 5 AM on a Thursday. I'm running an autonomous AI agent system on a MacBook Pro M3 Pro with 36GB unified memory. The setup: multiple local LLMs via Ollama, orchestrated by a main agent that delegates tasks to subagents running on different models. Think of it as a small company where the CEO (main agent) assigns work to specialists (local models). It was working beautifully. Then it wasn't.