An Algorithm for Mining Association Rules Based on Graph Theory and Maximum Path
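Authors: Tu Xiaobin, Guo Li, Liu Chenning, Zhou Ting, Zuo Liming

Author biography: Tu Xiaobin (born 1967), male, professor; research interest: engineering drawing. E-mail: 769283941@qq.com

CLC number: TP311

Funding: National Natural Science Foundation of China (11761033); Science and Technology Project of the Jiangxi Provincial Department of Science and Technology (20192BBHL80004)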

    Abstract:

    Association rule mining aims to discover associations or correlations among itemsets and is an important topic in data mining. Traditional algorithms are inefficient on very large data sets. To improve on them, this paper proposes an association rule mining algorithm based on graph theory and maximum paths. The algorithm first constructs a Boolean matrix from the transaction set. After the matrix is cleaned, it is converted into a graph, and an adjacency matrix is generated from the resulting association rule graph. For a step size k with k>2, the algorithm traverses each row starting from its first non-zero element to find the maximum-weight path; the row and column indices of the elements connected along that path form a frequent (k+2)-itemset. Experimental results show that the algorithm reduces the number of scans of the data set and that, on large data sets, it significantly shortens running time compared with the traditional Apriori algorithm.
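The first stages named in the abstract (Boolean matrix, matrix cleaning, adjacency matrix) can be illustrated with a minimal Python sketch. The abstract does not spell out the cleaning rule or the edge weights, so the sketch assumes that cleaning means dropping items whose support count is below a threshold and that an edge weight is the number of transactions containing both items. The function names, the `min_support` parameter, and the sample transactions are all hypothetical, and the maximum-weight path traversal is not reproduced because the abstract does not define it in enough detail.

```python
from itertools import combinations


def boolean_matrix(transactions):
    """Rows are transactions, columns are items in sorted order; 1 marks presence."""
    items = sorted({item for t in transactions for item in t})
    matrix = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, matrix


def clean(items, matrix, min_support):
    """Assumed cleaning rule: drop item columns whose support count is below min_support."""
    keep = [j for j in range(len(items))
            if sum(row[j] for row in matrix) >= min_support]
    kept_items = [items[j] for j in keep]
    kept_matrix = [[row[j] for j in keep] for row in matrix]
    return kept_items, kept_matrix


def adjacency(matrix, min_support):
    """Assumed weighting: entry (i, j) counts transactions containing both items.

    Pairs whose count is below min_support get weight 0 (no edge).
    """
    n = len(matrix[0]) if matrix else 0
    adj = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        weight = sum(row[i] & row[j] for row in matrix)
        if weight >= min_support:
            adj[i][j] = adj[j][i] = weight
    return adj


# Hypothetical sample data, not taken from the paper.
transactions = [
    {"bread", "milk", "eggs"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]

items, matrix = boolean_matrix(transactions)
items, matrix = clean(items, matrix, min_support=3)
adj = adjacency(matrix, min_support=3)

# Every non-zero entry above the diagonal corresponds to a frequent 2-itemset.
frequent_pairs = [(items[i], items[j], adj[i][j])
                  for i, j in combinations(range(len(items)), 2)
                  if adj[i][j] > 0]
print(frequent_pairs)
```

Under these assumptions the transaction set is scanned once to build the Boolean matrix, and all later steps work on the matrix and the graph, which is consistent with the abstract's claim of fewer data set scans.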

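Cite this article

Tu Xiaobin, Guo Li, Liu Chenning, Zhou Ting, Zuo Liming. An Algorithm for Mining Association Rules Based on Graph Theory and Maximum Path[J]. Journal of East China Jiaotong University, 2021, 38(3): 137-141.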

History
  • Published online: 2021-08-02