Distributionally Robust K-Means Clustering
1️⃣ One-sentence summary
This paper proposes a new, more robust k-means clustering method: by accounting for worst-case shifts in the data distribution, it substantially improves resistance to outliers, noise, and insufficient sample sizes, and it comes with an efficient solution algorithm.
K-means clustering is a workhorse of unsupervised learning, but it is notoriously brittle to outliers, distribution shifts, and limited sample sizes. Viewing k-means as Lloyd--Max quantization of the empirical distribution, we develop a distributionally robust variant that protects against such pathologies. We posit that the unknown population distribution lies within a Wasserstein-2 ball around the empirical distribution. In this setting, one seeks cluster centers that minimize the worst-case expected squared distance over this ambiguity set, leading to a minimax formulation. A tractable dual yields a soft-clustering scheme that replaces hard assignments with smoothly weighted ones. We propose an efficient block coordinate descent algorithm with provable monotonic decrease and local linear convergence. Experiments on standard benchmarks and large-scale synthetic data demonstrate substantial gains in outlier detection and robustness to noise.
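The abstract's dual-based soft-clustering scheme replaces hard nearest-center assignments with smooth weights, updated by block coordinate descent. The sketch below illustrates that idea with a generic soft k-means; the inverse-temperature `beta` is my stand-in for the smoothing induced by the Wasserstein ambiguity radius, not the paper's exact dual variable, and the update rules are the standard soft-assignment ones rather than the paper's formulation.

```python
import numpy as np

def soft_kmeans(X, k, beta=2.0, iters=100, seed=0, init=None):
    """Illustrative soft k-means via block coordinate descent.

    Soft weights w[i, j] ∝ exp(-beta * ||x_i - c_j||^2) replace the hard
    argmin of Lloyd's algorithm; as beta -> inf this recovers ordinary
    k-means. This is a sketch of the soft-clustering idea only, not the
    paper's distributionally robust dual.
    """
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)] if init is None else init.copy()
    for _ in range(iters):
        # Block 1: update soft assignments with centers held fixed.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        logits = -beta * d2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        w = np.exp(logits)
        w /= w.sum(axis=1, keepdims=True)
        # Block 2: update centers as assignment-weighted means.
        new_centers = (w.T @ X) / w.sum(axis=0)[:, None]
        if np.allclose(new_centers, centers, atol=1e-8):
            break
        centers = new_centers
    return centers, w
```

Each block update minimizes the same smoothed objective over one variable group with the other fixed, so the objective decreases monotonically, matching the convergence behavior the abstract claims for its algorithm.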
Source: arXiv: 2604.11118