Sequential Patterns Mining Using Concept Reduction for Similitude Degree

Authors: Hu Xuegang [1], Zhang Jing [1], Zhang Yuhong [1], Tan Zhe [1]

Affiliation: [1] School of Computer and Information, Hefei University of Technology, Hefei 230009, Anhui, China

Source: Journal of Yantai University (Natural Science and Engineering Edition), 2009, No. 3, pp. 202-205 (4 pages)

Funding: Natural Science Foundation of Anhui Province (Grant No. 050420207)

Abstract: Traditional sequential pattern mining algorithms can find all frequent sequences, but on massive data the result set may be too large to interpret. Sequential pattern mining based on a concept lattice effectively reduces the number of intermediate sequences generated and has some advantage in running time; the structure of a concept lattice also lends itself to reduction. This paper proposes a definition of approximate concepts. The method first builds a concept lattice from the transaction database and then reduces the concepts that satisfy the approximation condition, which lowers the number of frequent 1-sequences and, in turn, the total number of frequent sequences. Experiments show that, when a certain error is tolerated, the method improves both the understandability of the mining results and the mining efficiency.
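The abstract does not give the paper's definition of an approximate concept, so the following is only a minimal sketch of the general idea (build concepts from transactions, then drop concepts that are close to one already kept). It assumes, purely for illustration, that two concepts are "approximate" when the Jaccard similarity of their extents reaches a threshold. The function names `formal_concepts`, `jaccard`, and `reduce_similar`, the threshold, and the greedy keep-largest-first strategy are all hypothetical and not taken from the paper.

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate the formal concepts (extent, intent) of a toy context.

    `context` maps a transaction id to the set of items it contains.
    Brute force over item subsets, so only suitable for tiny examples.
    """
    items = sorted(set().union(*context.values()))
    concepts = set()
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            attrs = set(subset)
            # Extent: transactions that contain every item in the subset.
            extent = frozenset(t for t, its in context.items() if attrs <= its)
            # Intent: items shared by every transaction in the extent.
            if extent:
                intent = frozenset(
                    set.intersection(*(set(context[t]) for t in extent))
                )
            else:
                intent = frozenset(items)
            concepts.add((extent, intent))
    return concepts


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def reduce_similar(concepts, threshold):
    """ASSUMED reduction rule: visit concepts from largest extent to
    smallest and drop any whose extent is at least `threshold`-similar
    to the extent of a concept that was already kept."""
    ordered = sorted(
        concepts, key=lambda c: (-len(c[0]), sorted(c[0]), sorted(c[1]))
    )
    kept = []
    for extent, intent in ordered:
        if any(jaccard(extent, k_ext) >= threshold for k_ext, _ in kept):
            continue
        kept.append((extent, intent))
    return kept


# Toy transaction database (hypothetical data).
context = {1: {"a", "b"}, 2: {"a", "b"}, 3: {"a", "b", "c"}, 4: {"a"}}
all_concepts = formal_concepts(context)
reduced = reduce_similar(all_concepts, threshold=0.75)
```

On this toy data the lattice has three concepts. With a threshold of 0.75, the concept with extent {1, 2, 3} is merged into the one with extent {1, 2, 3, 4}, leaving two. Fewer concepts mean fewer candidate frequent 1-sequences, at the cost of some error, which is the trade-off the abstract describes.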

Keywords: data mining; frequent sequences; concept lattice; concept reduction

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
