Efficient algorithm AWAPM_SPRMII for mining all-weighted positive and negative association patterns (cited by: 1)


Authors: Gao Liang (高亮)[1], Xia Bing (夏冰)[2], Huang Mingxuan (黄名选)[3]

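Affiliations: [1] School of Software, Zhongyuan University of Technology, Zhengzhou 450007; [2] School of Computer Science, Zhongyuan University of Technology, Zhengzhou 450007; [3] School of Information and Statistics, Guangxi University of Finance and Economics, Nanning 530003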

Source: Application Research of Computers (《计算机应用研究》), 2015, No. 6, pp. 1642-1648 (7 pages)

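Funding: National Natural Science Foundation of China (61262028; 61363037); Henan Province Science and Technology Research Project (132102310284); Guangxi Natural Science Foundation (2012GXNSFAA053235); Guangxi Education Department Research Projects (201203YB225; 2013LX236; KY2015YB483); Guangxi Higher Education Outstanding Talent Program (Gui Jiao Ren [2011] No. 40); Quantitative Economics Innovation Team Project of Guangxi University of Finance and Economics (2014CX01)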

Abstract: All-weighted positive and negative association patterns have important theoretical and practical value in text mining, information retrieval, and related fields. To address the shortcomings of existing mining algorithms, this paper constructs SPRMII (support-probability ratio-mutual information-interest), an evaluation framework for all-weighted positive and negative association patterns, and proposes a dual interest threshold pruning strategy for all-weighted itemsets. Based on this pruning strategy, it then proposes a new mining algorithm built on the SPRMII framework, AWAPM_SPRMII (all-weighted association patterns mining based on SPRMII). The algorithm overcomes the defects of traditional association rule mining, avoids generating ineffective and uninteresting association patterns, and uses the new pruning method to mine interesting frequent itemsets and negative itemsets from all-weighted databases. From these itemsets it then discovers valid all-weighted positive and negative association rules, using only a simple computation and comparison of each itemset's weight-to-dimension ratio together with the SPRMII evaluation framework. Theoretical analysis and experimental results on a real-world text dataset show that the algorithm is effective and scales well, and that it achieves good mining performance compared with existing classical mining algorithms.
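The abstract names two ingredients, a support measure built from the ratio of an itemset's weight to its dimension and a probability-ratio measure, without giving their formulas. The sketch below is only an illustration of how such measures could be computed over an all-weighted database (each transaction maps an item to its own weight). The function names, the normalisation by total database weight, and the lift-style probability ratio are assumptions made for this example, not the definitions used in the paper.

```python
from typing import Dict, Iterable, List

# Assumption: a transaction maps each item to its weight in that transaction
# (for example, a term weight within one document).
Transaction = Dict[str, float]


def total_weight(db: List[Transaction]) -> float:
    """Sum of every item weight in the whole database."""
    return sum(sum(t.values()) for t in db)


def weighted_support(itemset: Iterable[str], db: List[Transaction]) -> float:
    """Hypothetical all-weighted support: the itemset's accumulated weight in
    the transactions that contain it, divided by (dimension * total weight)."""
    items = frozenset(itemset)
    if not items:
        raise ValueError("itemset must not be empty")
    denominator = len(items) * total_weight(db)
    if denominator == 0:
        return 0.0
    covered = sum(
        sum(t[i] for i in items)
        for t in db
        if items.issubset(t.keys())
    )
    return covered / denominator


def probability_ratio(a: Iterable[str], b: Iterable[str],
                      db: List[Transaction]) -> float:
    """Hypothetical lift-style ratio sup(A ∪ B) / (sup(A) * sup(B))."""
    a_set, b_set = frozenset(a), frozenset(b)
    sup_a = weighted_support(a_set, db)
    sup_b = weighted_support(b_set, db)
    if sup_a == 0 or sup_b == 0:
        return 0.0
    return weighted_support(a_set | b_set, db) / (sup_a * sup_b)


if __name__ == "__main__":
    db = [{"a": 1.0, "b": 1.0}, {"a": 2.0}, {"b": 1.0, "c": 1.0}]
    print(weighted_support({"a"}, db))
    print(weighted_support({"a", "b"}, db))
    print(probability_ratio({"a"}, {"b"}, db))
```

Under these assumed definitions, a ratio above 1 would suggest a candidate positive rule between A and B and a ratio below 1 a candidate negative rule. The thresholds and the mutual-information and interest tests of the actual SPRMII framework are not reproduced here.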

Keywords: data mining; positive and negative association patterns; all-weighted association rules; frequent itemsets

Classification code: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
