arXiv submission date: 2026-02-17
📄 Abstract - Doubly Stochastic Mean-Shift Clustering

Standard Mean-Shift algorithms are notoriously sensitive to the bandwidth hyperparameter, particularly in data-scarce regimes where fixed-scale density estimation leads to fragmentation and spurious modes. In this paper, we propose Doubly Stochastic Mean-Shift (DSMS), a novel extension that introduces randomness not only in the trajectory updates but also in the kernel bandwidth itself. By randomly drawing the data samples and drawing the radius from a continuous uniform distribution at each iteration, DSMS explores the density landscape more effectively. We show that this randomized bandwidth policy acts as an implicit regularization mechanism, and we provide theoretical convergence results. Comparative experiments on synthetic Gaussian mixtures show that DSMS significantly outperforms standard and stochastic Mean-Shift baselines, remaining stable and preventing over-segmentation in sparse clustering scenarios without degrading performance in other respects.
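
The two sources of randomness described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the flat (ball) kernel, the mini-batch subsampling scheme, the fixed iteration count, and the names `dsms_step`, `dsms`, `h_low`, `h_high`, and `batch_size` are all assumptions made here for illustration, since the abstract specifies only that the data samples and the radius are drawn at random at each iteration.

```python
import numpy as np


def dsms_step(x, data, rng, h_low, h_high, batch_size):
    """One doubly stochastic update of a single trajectory point x (illustrative)."""
    # First source of randomness: a random subsample of the data (assumed scheme).
    idx = rng.choice(len(data), size=min(batch_size, len(data)), replace=False)
    batch = data[idx]
    # Second source of randomness: the radius, drawn from Uniform(h_low, h_high).
    h = rng.uniform(h_low, h_high)
    # Flat-kernel mean shift: move to the mean of the sampled points within radius h.
    mask = np.linalg.norm(batch - x, axis=1) <= h
    if not mask.any():
        return x  # no sampled neighbour in range; stay put for this iteration
    return batch[mask].mean(axis=0)


def dsms(data, h_low, h_high, batch_size=32, n_iter=50, seed=0):
    """Run the sketch from every data point and return the shifted points."""
    rng = np.random.default_rng(seed)
    points = np.asarray(data, dtype=float).copy()
    for _ in range(n_iter):
        for i in range(len(points)):
            points[i] = dsms_step(points[i], data, rng, h_low, h_high, batch_size)
    return points
```

Calling `dsms(data, h_low=1.0, h_high=3.0)` on an `(n, d)` array returns an array of the same shape whose rows can then be grouped into clusters, for example by merging rows that end up close together. The bandwidth range here is a hypothetical choice, not a value taken from the paper.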

Top-level tags: machine learning model training theory
Detailed tags: clustering mean-shift stochastic optimization bandwidth selection density estimation

Doubly Stochastic Mean-Shift Clustering


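1️⃣ One-sentence summary

This paper proposes a new doubly stochastic mean-shift clustering method that randomly selects both the data points and the kernel bandwidth at each iteration, which addresses the traditional algorithm's sensitivity to the bandwidth parameter and its tendency to produce spurious segmentation when data are sparse, so that it explores the data density more stably and improves clustering quality.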

Source: arXiv: 2602.15393