AI RESEARCH
PCSR: Pseudo-label Consistency-Guided Sample Refinement for Noisy Correspondence Learning
arXiv cs.CV
arXiv:2509.15623v2 Cross-modal retrieval aims to align different modalities via semantic similarity. However, existing methods often assume that image-text pairs are perfectly aligned, overlooking the noisy correspondences present in real data. These misaligned pairs misguide similarity learning and degrade retrieval performance. Previous methods often rely on coarse-grained categorizations that simply divide data into clean and noisy samples, overlooking the intrinsic diversity within noisy instances. Moreover, they typically apply uniform