PromptHub: Enhancing Multi-Prompt Visual In-Context Learning with Locality-Aware Fusion, Concentration and Alignment

arXiv:2603.18891v1 Announce Type: cross Visual In-Context Learning (VICL) aims to complete vision tasks by imitating pixel demonstrations. Recent work pioneered prompt fusion that combines the advantages of various demonstrations, which shows a promising way to extend VICL. Unfortunately, the patch-wise fusion framework and model-agnostic supervision hinder the exploitation of informative cues, thereby limiting performance gains. To overcome this deficiency, we