Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference

arXiv:2510.05497v4 Announce Type: replace-cross Large-scale Mixture of Experts (MoE) Large Language Models (LLMs) have recently become the frontier open-weight models, achieving remarkable model capability similar to [...] To understand the patterns underlying this data movement, we conduct comprehensive data-movement-centric profiling across four state-of-the-art large-scale MoE models released in 2025 (200B-1000B parameters), using over 24,000 requests spanning diverse workloads.