Density Peak Clustering Algorithm Based on Natural and Weighted Shared Nearest Neighbors
Affiliation:

School of Science, East China Jiaotong University

Fund Project:

Natural Science Foundation of Jiangxi Province (20224BAB211005): Classification of flag-transitive block designs and their automorphism groups

    Abstract:

    Density peak clustering (DPC) is widely used as an efficient, non-iterative clustering algorithm. However, on datasets with non-spherical clusters or non-uniform density, DPC often fails to select the correct cluster centers, and its results depend heavily on the cutoff distance parameter. 【Objective】To address the poor performance of DPC on datasets with non-uniform density, 【Method】we propose a density peak clustering algorithm based on natural and weighted shared nearest neighbors. The algorithm first uses natural nearest neighbors to compute weights, then recomputes the similarity between data objects from the definitions of first-order and second-order shared nearest neighbors. It next combines the shared nearest neighbor similarity with the natural nearest neighbor weights to compute relative density and relative distance. Finally, it applies a new type-specific assignment strategy that diffuses outward from the cluster centers. 【Result】Experiments on eight datasets of different types show that the proposed algorithm clearly outperforms the four comparison algorithms in clustering performance. 【Conclusion】The method also identifies cluster centers well on datasets with non-uniform density, which resolves the problem described above.
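
To make the quantities named in the abstract concrete, the following is a minimal sketch of the baseline DPC density and distance, plus a plain shared k-nearest-neighbor count. It is not the proposed algorithm: the abstract does not give the paper's natural-nearest-neighbor weights or its first-order and second-order shared-neighbor similarity, so none of that is reproduced here. The function names, the toy data, and the values of `d_c` and `k` are assumptions chosen for illustration.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def dpc_rho_delta(X, d_c):
    """Baseline DPC quantities with the cutoff kernel (illustrative helper).

    rho[i]   = number of other points closer than d_c to point i
    delta[i] = distance to the nearest point that comes earlier in the
               density ordering (for the first point in that ordering:
               its largest distance to any point)
    """
    D = pairwise_distances(X)
    n = len(X)
    rho = (D < d_c).sum(axis=1) - 1          # subtract 1 to exclude the point itself
    order = np.argsort(-rho, kind="stable")  # densest first, ties broken by index
    delta = np.zeros(n)
    delta[order[0]] = D[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        delta[i] = D[i, order[:rank]].min()
    return rho, delta

def shared_neighbor_counts(X, k):
    """Number of common k-nearest neighbors for every pair of points."""
    D = pairwise_distances(X)
    np.fill_diagonal(D, np.inf)              # a point is not its own neighbor
    knn = np.argsort(D, axis=1, kind="stable")[:, :k]
    member = np.zeros((len(X), len(X)), dtype=int)
    member[np.arange(len(X))[:, None], knn] = 1   # member[i, j] = 1 if j is a k-NN of i
    return member @ member.T

# Toy data (assumed): a dense cluster of five points and a sparse cluster of three.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
              [10, 10], [10, 12], [12, 10]], dtype=float)

rho, delta = dpc_rho_delta(X, d_c=1.0)
gamma = rho * delta                          # large gamma suggests a cluster center
snn = shared_neighbor_counts(X, k=2)
```

With `d_c=1.0`, every point in the sparse cluster has `rho` equal to 0, so `gamma` is 0 there and no center can be picked for that cluster. Raising `d_c` to 2.5 gives the point at (10, 10) a nonzero density. This is the sensitivity to the cutoff distance that the abstract describes, and neighbor-based measures such as `snn` are one way to avoid a single global distance threshold.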

History
  • Received: 2024-03-19
  • Last revised: 2024-04-20
  • Accepted: 2024-05-06
  • Published online: 2024-06-14