AI RESEARCH

MoLoRA: Composable Specialization via Per-Token Adapter Routing

arXiv cs.AI

arXiv:2603.15965v1 Announce Type: cross Multi-adapter serving systems route entire sequences to a single adapter, forcing a choice when requests span multiple domains. This assumption fails in two important settings: (1) multimodal generation, where text and image tokens require different adapters within the same sequence, and (2) mixed-capability requests like "write code to solve this equation," which need expertise from multiple specialized adapters. We