Authors: 雷雨[1] 李曼 胡卫松 宋国杰[1] 谢昆青[1]
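Affiliations: [1] Key Laboratory of Machine Perception and Intelligence (Ministry of Education), Peking University, Beijing 100871 [2] NEC Labs China, Beijing 100084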
Source: 《计算机科学与探索》 (Journal of Frontiers of Computer Science and Technology), 2015, No. 4, pp. 429-437 (9 pages)
Abstract: Sequential pattern mining is a classic problem in data mining with a wide range of applications. Previous studies have focused mostly on mining frequent sequential patterns. However, infrequent sequential patterns, that is, rare sequential patterns (RSP), may reveal unusual regularities and can therefore be of greater interest to analysts. This paper defines the problem of mining rare sequential patterns and proposes two level-wise algorithms for mining the complete set of rare sequential patterns. To overcome the combinatorial explosion that arises when mining the complete set, the paper further proposes an efficient binary-search-based algorithm that mines only the set of minimal rare sequential patterns (MRSP), which retains the complete information of the full set of rare sequential patterns. Experimental results show that the proposed algorithms mine rare sequential patterns effectively.
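To make the notion of a minimal rare sequential pattern concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not give the formal definitions, so the sketch assumes that a pattern is "rare" when fewer than `min_count` database sequences contain it as a subsequence (zero-support patterns included), and "minimal rare" when it is rare but every proper subsequence is frequent. It uses a plain level-wise (Apriori-style) search rather than the binary-search method the abstract mentions, and all function and parameter names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of database sequences that contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def minimal_rare_patterns(database, min_count):
    """Level-wise search for minimal rare patterns (assumed definition).

    Assumption: a pattern is rare if support < min_count, and minimal rare
    if it is rare while all of its proper subsequences are frequent.
    """
    if min_count < 1:
        raise ValueError("min_count must be at least 1")
    items = sorted({x for seq in database for x in seq})
    frequent = {()}  # the empty pattern is contained in every sequence
    minimal_rare = []
    while frequent:
        next_frequent = set()
        # Extend each frequent pattern of the current length by one item.
        candidates = {p + (x,) for p in frequent for x in items}
        for cand in sorted(candidates):
            # Support is anti-monotone, so it is enough to check the
            # subsequences obtained by deleting exactly one item.
            subs = {cand[:i] + cand[i + 1:] for i in range(len(cand))}
            if not subs <= frequent:
                continue
            if support(cand, database) >= min_count:
                next_frequent.add(cand)
            else:
                minimal_rare.append(cand)
        frequent = next_frequent
    return minimal_rare


database = [("a", "b", "c"), ("a", "c"), ("a", "b"), ("b", "c")]
result = minimal_rare_patterns(database, min_count=2)
```

In this toy database, `("a", "b", "c")` ends up in `result` because it occurs in only one sequence while `("a", "b")`, `("a", "c")` and `("b", "c")` each occur in two. Every rare pattern contains at least one minimal rare pattern as a subsequence, which is why the minimal set can stand in for the full set.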
Classification No.: TP311 [Automation and Computer Technology: Computer Software and Theory]