Seeing is Believing: Robust Vision-Guided Cross-Modal Prompt Learning under Label Noise

arXiv:2604.09532v1 Announce Type: cross

Prompt learning is a parameter-efficient approach for adapting vision-language models, yet its robustness under label noise remains underexplored. Visual content carries richer and more reliable semantic information and remains robust under label noise, whereas the prompt itself is highly susceptible to it. Motivated by this observation, we propose VisPrompt, a lightweight and robust vision-guided prompt learning framework for noisy-label settings.