
SAGE: Training-Free Semantic Evidence Composition for Edge-Cloud Inference under Hard Uplink Budgets

arXiv CS.LG

arXiv:2604.19623v1 Announce Type: new

Edge-cloud hybrid inference offloads difficult inputs to a powerful remote model, but the uplink channel imposes hard per-request constraints on the number of bits that can be transmitted. We show that selecting transmitted content based solely on attention-based importance, the standard approach in collaborative inference, is inherently limited under hard budgets. Two findings support this claim. First, replacing high-importance units with low-importance but complementary ones improves server accuracy.