AI RESEARCH
Compass: SLO-aware Query Planner for Compound AI Serving at Scale
arXiv cs.LG
arXiv:2504.16397v2 Announce Type: replace-cross The rise of compound AI serving, which integrates multiple operators in a pipeline, enables end-user applications such as generative AI-powered meeting companions, autonomous driving, and immersive gaming. These workloads span diverse deployment spaces, from cloud-only queries to edge-assisted ones across infrastructure tiers, and a single application often includes both.