Authors: XIAO Wen (肖文); HU Juan (胡娟)
Affiliation: [1] Department of Electrical and Information Engineering, Hohai University Wentian College, Maanshan, Anhui 243031, China
Source: Journal of Computer Applications (《计算机应用》), 2018, No. 4, pp. 995-1000 (6 pages)
Funding: Natural Science Research Project of Anhui Higher Education Institutions (KJ2016A623)
Abstract: Frequent itemset mining (FIM) is one of the most fundamental data mining tasks, and the characteristics of the mined dataset have a significant effect on the performance of FIM algorithms. Dataset sparseness is one of the attributes that reflect the essential character of a dataset, and different types of FIM algorithms differ greatly in how well they scale with it. To address how to quantify dataset sparseness and how sparseness affects the performance of different types of FIM algorithms, the existing measurement methods are first reviewed and discussed, and two new quantitative measures are then proposed: one based on transaction difference and one based on the FP-Tree. Both measures take into account the influence of the minimum support on dataset sparseness in the context of the FIM task, and both reflect the degree of difference between the frequent itemsets of transactions. Finally, the scalability of different types of FIM algorithms with respect to dataset sparseness is verified experimentally. The experimental results show that dataset sparseness is inversely proportional to the minimum support, and that FIM algorithms based on the vertical format have the best sparseness scalability among the three typical classes of FIM algorithms.
Classification: TP311.5 [Automation and Computer Technology: Computer Software and Theory]
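The abstract does not give formulas for its two measures, so the following is only a minimal sketch of the general idea that sparseness depends on the minimum support. It assumes a stand-in measure, the mean pairwise Jaccard distance between transactions after infrequent items are dropped; this is not the paper's transaction-difference or FP-Tree measure, and the names `frequent_items` and `sparseness` and the toy dataset are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_items(transactions, min_support):
    """Return the items whose relative support is at least min_support."""
    n = len(transactions)
    counts = Counter(item for t in transactions for item in set(t))
    return {item for item, c in counts.items() if c / n >= min_support}


def sparseness(transactions, min_support):
    """Mean pairwise Jaccard distance after dropping infrequent items.

    Illustrative stand-in only; NOT the measure defined in the paper.
    """
    keep = frequent_items(transactions, min_support)
    projected = [set(t) & keep for t in transactions]
    pairs = list(combinations(projected, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        union = a | b
        # Two empty transactions are treated as identical (distance 0).
        total += 0.0 if not union else 1.0 - len(a & b) / len(union)
    return total / len(pairs)


# Hypothetical toy dataset: four transactions over items a..e.
transactions = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "c", "e"],
    ["a", "b"],
]

for s in (0.25, 0.75, 1.0):
    print(s, round(sparseness(transactions, s), 4))
```

On this toy dataset the stand-in value falls as the minimum support rises, because fewer items survive the filter and the remaining transactions look more alike. That agrees in direction with the trend reported in the abstract, but it is not a reproduction of the paper's experiments.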