LoC-Path: Learning to Compress for Pathology Multimodal Large Language Models

ArXi:2512.05391v3 Announce Type: replace Whole Slide Image (WSI) MLLMs are difficult to build and deploy because gigapixel slides induce thousands of visual tokens, while only a small fraction of regions is diagnostically relevant. Existing slide-level pathology MLLMs typically combine heavy slide-level encoders with long visual prefixes, making end-to-end slide-level development and deployment expensive under limited computational resources.