AI RESEARCH
Distributionally Robust K-Means Clustering
arXiv CS.LG
arXiv:2604.11118v1 Announce Type: new K-means clustering is a workhorse of unsupervised learning, but it is notoriously brittle to outliers, distribution shifts, and limited sample sizes. Viewing k-means as Lloyd-Max quantization of the empirical distribution, we develop a distributionally robust variant that protects against such pathologies. We posit that the unknown population distribution lies within a Wasserstein-2 ball around the empirical distribution.
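The excerpt above does not state the objective explicitly, so the following is only a sketch of the minimax form such a setup usually takes. The symbols are assumptions introduced for illustration: samples $x_1, \dots, x_n$, empirical distribution $\hat{P}_n$, centers $c_1, \dots, c_K$, and ball radius $\varepsilon \ge 0$. The paper's own notation and exact formulation may differ.

```latex
% Empirical distribution of the n samples (assumed notation)
\hat{P}_n = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}

% Standard k-means as quantization of \hat{P}_n
\min_{c_1, \dots, c_K} \;
  \mathbb{E}_{x \sim \hat{P}_n}
  \Big[ \min_{1 \le k \le K} \lVert x - c_k \rVert_2^2 \Big]

% Distributionally robust variant: worst case over a Wasserstein-2 ball
% of assumed radius \varepsilon centered at \hat{P}_n
\min_{c_1, \dots, c_K} \;
  \sup_{Q \,:\, W_2(Q, \hat{P}_n) \le \varepsilon} \;
  \mathbb{E}_{x \sim Q}
  \Big[ \min_{1 \le k \le K} \lVert x - c_k \rVert_2^2 \Big]
```

Setting $\varepsilon = 0$ collapses the ball to $\hat{P}_n$ itself, so the robust problem reduces to the standard k-means objective in this sketch.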