AI RESEARCH
Rocks, Pebbles and Sand: Modality-aware Scheduling for Multimodal Large Language Model Inference
arXiv CS.AI
arXiv:2603.26498v1 Announce Type: cross Multimodal Large Language Models (MLLMs) power platforms like ChatGPT, Gemini, and Copilot, enabling richer interactions with text, images, and videos. These heterogeneous workloads