arXiv submission date: 2026-02-11
📄 Abstract - Chamfer-Linkage for Hierarchical Agglomerative Clustering

Hierarchical Agglomerative Clustering (HAC) is a widely used clustering method based on repeatedly merging the closest pair of clusters, where inter-cluster distances are determined by a linkage function. Unlike many clustering methods, HAC does not optimize a single explicit global objective; clustering quality is therefore primarily evaluated empirically, and the choice of linkage function plays a crucial role in practice. However, popular classical linkages, such as single-linkage, average-linkage, and Ward's method, show high variability across real-world datasets and do not consistently produce high-quality clusterings in practice. In this paper, we propose Chamfer-linkage, a novel linkage function that measures the distance between clusters using the Chamfer distance, a popular notion of distance between point clouds in machine learning and computer vision. We argue that Chamfer-linkage satisfies desirable concept representation properties that other popular measures struggle to satisfy. Theoretically, we show that Chamfer-linkage HAC can be implemented in $O(n^2)$ time, matching the efficiency of classical linkage functions. Experimentally, we find that Chamfer-linkage consistently yields higher-quality clusterings than classical linkages such as average-linkage and Ward's method across a diverse collection of datasets. Our results establish Chamfer-linkage as a practical drop-in replacement for classical linkage functions, broadening the toolkit for hierarchical clustering in both theory and practice.
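To make the idea in the abstract concrete, here is a minimal sketch of HAC with a Chamfer-style linkage. It rests on assumptions not stated in the abstract: the linkage is taken to be the commonly used symmetric Chamfer distance (the mean nearest-neighbor distance in each direction, summed), and the function names `chamfer_distance` and `naive_chamfer_hac` are hypothetical. The paper's exact definition and normalization may differ, and this brute-force loop is far slower than the $O(n^2)$ algorithm the paper describes.

```python
import math


def chamfer_distance(A, B):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumption: each direction is the mean distance from a point to its
    nearest neighbor in the other set; the paper may normalize differently.
    """
    def directed(X, Y):
        return sum(min(math.dist(x, y) for y in Y) for x in X) / len(X)

    return directed(A, B) + directed(B, A)


def naive_chamfer_hac(points, num_clusters):
    """Brute-force HAC: repeatedly merge the pair of clusters with the
    smallest linkage value until num_clusters clusters remain.

    Illustrative only; this is not the O(n^2) algorithm from the paper.
    """
    clusters = [[p] for p in points]
    while len(clusters) > num_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = chamfer_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
result = naive_chamfer_hac(points, num_clusters=2)
```

On this toy input, the two points near the origin end up in one cluster and the two points near x = 10 in the other. A different linkage could be tried by swapping out `chamfer_distance` while leaving the merge loop unchanged.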

Top-level tags: machine learning model evaluation data
Detailed tags: hierarchical clustering linkage function chamfer distance agglomerative clustering clustering evaluation

Chamfer-Linkage for Hierarchical Agglomerative Clustering


1️⃣ One-sentence summary

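This paper proposes a new linkage function called "Chamfer-linkage" to improve hierarchical clustering. By computing inter-cluster distances in a more robust way, it produces high-quality clusterings more consistently than classical linkages across different kinds of datasets.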

Source: arXiv: 2602.10444