Improving clustering quality evaluation in noisy Gaussian mixtures
arXiv CS.LG
arXiv:2503.00379v3
Clustering is a well-established technique in machine learning and data analysis, widely used across various domains. Cluster validity indices, such as the Average Silhouette Width, Calinski-Harabasz, and Davies-Bouldin indices, play a crucial role in assessing clustering quality when external ground truth labels are unavailable. However, these measures can be affected by different degrees of feature relevance, potentially leading to unreliable evaluations in high-dimensional or noisy data sets.
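
To make the effect of irrelevant features concrete, here is a minimal sketch that computes the Average Silhouette Width for the same two-cluster partition, once on two informative features alone and once with extra noise features appended. It illustrates the problem the abstract describes and is not the method proposed in the paper. The cluster centers, sample sizes, noise distribution, and the helper names `average_silhouette_width` and `make_data` are all assumptions chosen for illustration.

```python
import math
import random


def average_silhouette_width(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    Assumes at least two distinct cluster labels.
    """
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, excluding p itself.
        mean_dist = {}
        for c in clusters:
            dists = [math.dist(p, q) for j, q in enumerate(points)
                     if labels[j] == c and j != i]
            if dists:
                mean_dist[c] = sum(dists) / len(dists)
        own = labels[i]
        if own not in mean_dist:
            # Singleton cluster: the silhouette is conventionally set to 0.
            scores.append(0.0)
            continue
        a = mean_dist[own]                                      # cohesion
        b = min(d for c, d in mean_dist.items() if c != own)    # separation
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def make_data(n_per_cluster=40, n_noise=0, seed=0):
    """Two Gaussian clusters in 2 informative features plus n_noise noise features.

    Hypothetical setup: unit-variance clusters centered at (0, 0) and (6, 6);
    noise features are N(0, 1) and carry no cluster information.
    """
    rng_info = random.Random(seed)       # drives the informative features
    rng_noise = random.Random(seed + 1)  # drives the noise features
    points, labels = [], []
    for label, center in enumerate([(0.0, 0.0), (6.0, 6.0)]):
        for _ in range(n_per_cluster):
            informative = [rng_info.gauss(mu, 1.0) for mu in center]
            noise = [rng_noise.gauss(0.0, 1.0) for _ in range(n_noise)]
            points.append(informative + noise)
            labels.append(label)
    return points, labels


# Same partition and same informative coordinates; only the noise features differ.
clean_points, clean_labels = make_data(n_noise=0)
noisy_points, noisy_labels = make_data(n_noise=20)

asw_clean = average_silhouette_width(clean_points, clean_labels)
asw_noisy = average_silhouette_width(noisy_points, noisy_labels)

print(f"ASW, 2 informative features:       {asw_clean:.3f}")
print(f"ASW, plus 20 noise features:       {asw_noisy:.3f}")
```

Because the labels and the informative coordinates are identical in both runs, any drop in the index comes only from the added noise features: the partition is equally correct in both cases, but the index rates it lower. The sketch uses only the standard library so that it runs without extra dependencies; a library implementation of the silhouette score could be substituted.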