AI RESEARCH

Language-Assisted Image Clustering Guided by Discriminative Relational Signals and Adaptive Semantic Centers

arXiv cs.LG

arXiv:2603.24275v1

Language-Assisted Image Clustering (LAIC) augments input images with additional text, obtained with the help of vision-language models (VLMs), to improve clustering performance. Despite recent progress, existing LAIC methods often overlook two issues: (i) the textual features constructed for each image are highly similar to one another, which leads to weak inter-class discriminability; (ii) the clustering step is restricted to pre-built image-text alignments, which limits how well the text modality can be exploited.
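Issue (i) can be made concrete with a simple diagnostic: if per-image text features are nearly identical, their average pairwise cosine similarity sits close to 1, leaving little signal to separate classes. The sketch below is only an illustration under stated assumptions and is not the paper's method. The function name `mean_pairwise_cosine` and the synthetic arrays `collapsed` and `spread` are hypothetical stand-ins; real features would come from a VLM text encoder.

```python
import numpy as np


def mean_pairwise_cosine(features: np.ndarray) -> float:
    """Average cosine similarity over all pairs of distinct rows."""
    # L2-normalize each row so the dot product equals cosine similarity.
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = normed @ normed.T
    n = sims.shape[0]
    # Drop the diagonal (self-similarity is always 1).
    off_diag = sims[~np.eye(n, dtype=bool)]
    return float(off_diag.mean())


rng = np.random.default_rng(0)

# Hypothetical "collapsed" text features: one shared direction plus small noise,
# mimicking per-image texts that all look alike.
shared = rng.normal(size=(1, 64))
collapsed = shared + 0.05 * rng.normal(size=(100, 64))

# Hypothetical well-spread features for comparison.
spread = rng.normal(size=(100, 64))

collapsed_score = mean_pairwise_cosine(collapsed)
spread_score = mean_pairwise_cosine(spread)

print(f"collapsed features: {collapsed_score:.3f}")
print(f"spread features:    {spread_score:.3f}")
```

A score near 1 for real text features would be consistent with the weak discriminability described above, although a low score alone does not guarantee that the features separate the true classes.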