AI RESEARCH
SIFT-VTON: Geometric Correspondence Supervision on Cross-Attention for Virtual Try-On
arXiv CS.CV
arXiv:2605.01296v1 Announce Type: new Diffusion-based virtual try-on methods achieve photorealistic synthesis by using cross-attention to transfer garment features to the target body regions. However, these methods learn spatial correspondences only implicitly, and they struggle to preserve fine details such as text and illustrations. We propose SIFT-VTON, which uses SIFT keypoint matching to provide explicit geometric guidance for diffusion-based virtual try-on.
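To make "SIFT keypoint matching" concrete, the following is a minimal sketch of the nearest-neighbour ratio test commonly used to match SIFT descriptors between two images. It is not taken from the paper: the function name `ratio_test_matches`, the 0.75 threshold, and the 3-dimensional toy descriptors (real SIFT descriptors have 128 dimensions) are illustrative assumptions, and the abstract does not say how SIFT-VTON turns such matches into supervision for cross-attention.

```python
import math


def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Match descriptors in desc_a to desc_b using the ratio test.

    Hypothetical helper for illustration. A match (i, j) is kept only when
    the nearest neighbour of desc_a[i] is clearly closer than the second
    nearest, which discards ambiguous correspondences.
    """
    matches = []
    if len(desc_b) < 2:
        # The ratio test needs at least two candidates to compare.
        return matches
    for i, da in enumerate(desc_a):
        # Euclidean distance from this descriptor to every candidate.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        (best, j), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches


# Toy descriptors (assumed values): two distinctive garment features
# and one that is equally close to two candidates in the person image.
garment = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 0.0)]
person = [(0.0, 0.9, 0.1), (0.9, 0.0, 0.1), (0.0, 0.0, 1.0)]

matches = ratio_test_matches(garment, person)
print(matches)
```

In this toy input the first two garment descriptors each have one clearly nearest candidate and are kept, while the third is roughly equidistant from two candidates and is rejected. Each surviving pair links a garment location to a person location, which is the kind of explicit correspondence the abstract contrasts with implicitly learned attention.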