AI RESEARCH
Towards Understanding, Analyzing, and Optimizing Agentic AI Execution: A CPU-Centric Perspective
arXiv CS.AI
arXiv:2511.00739v3
Agentic AI serving transforms monolithic LLM-based inference into autonomous problem-solvers that can plan, call tools, perform reasoning, and adapt on the fly. Because task execution needs are diverse, such serving relies heavily on heterogeneous CPU-GPU systems, in which the majority of the external tools responsible for agentic capability either run on or are orchestrated by the CPU. To build a deeper understanding of the CPU's role, this paper characterizes and analyzes the system bottlenecks.