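Authors: Song Shijie (宋世杰)[1], Hu Huaping (胡华平)[1], Zhou Jiawei (周嘉伟)[1], Jin Shiyao (金士尧)[1]
Affiliation: [1] School of Computer Science, National University of Defense Technology, Changsha 410073
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2006, No. 1, pp. 68-74 (7 pages)
Funding: National Natural Science Foundation of China (60573136); National "863" High-Tech Research and Development Program of China (2003AA142010)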
Abstract: Building on a redefinition of sequential pattern length that increases the granularity of sequential pattern mining, this paper presents HVSM, a first-horizontally-last-vertically scanning sequential pattern mining algorithm based on large-itemset reuse. The algorithm represents the database as a vertical bitmap. It first extends itemsets horizontally and collects all mined large itemsets as one-large-sequence itemsets. It then extends sequences vertically, treating each one-large-sequence itemset as an integrated block ("container") so that large itemsets are reused when mining k-large-sequences. Candidate large sequences are generated by taking sibling (brother) nodes as seeds, and support is counted by recording the 1st-TID. Experiments show that, on medium-sized and large transaction databases, HVSM finds frequent sequences faster than the SPAM algorithm.
Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]
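
The abstract names three ingredients: a vertical bitmap representation, horizontal extension of itemsets, and vertical extension of sequences with support counted from the 1st-TID. The sketch below illustrates those ideas on a toy database. It is not the authors' HVSM implementation: the function names, the data layout, and the reading of "1st-TID" as "the earliest transaction index at which a customer's sequence can end" are all assumptions made for illustration, and candidate generation from sibling nodes is not shown.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical layout: customer id -> ordered list of transactions (item sets).
Database = Dict[str, List[FrozenSet[str]]]
Bitmap = Dict[str, int]  # customer id -> int whose bit t marks transaction t


def build_vertical_bitmaps(db: Database) -> Dict[str, Bitmap]:
    """For each item, set bit t for a customer if transaction t contains it."""
    bitmaps: Dict[str, Bitmap] = {}
    for cid, transactions in db.items():
        for t, items in enumerate(transactions):
            for item in items:
                per_customer = bitmaps.setdefault(item, {})
                per_customer[cid] = per_customer.get(cid, 0) | (1 << t)
    return bitmaps


def itemset_bitmap(bitmaps: Dict[str, Bitmap], itemset: Iterable[str]) -> Bitmap:
    """Horizontal step: AND the item bitmaps, dropping customers with no match."""
    items = list(itemset)
    result = dict(bitmaps.get(items[0], {}))
    for item in items[1:]:
        other = bitmaps.get(item, {})
        result = {
            cid: bits & other[cid]
            for cid, bits in result.items()
            if cid in other and bits & other[cid]
        }
    return result


def first_tids(bitmap: Bitmap) -> Dict[str, int]:
    """Index of the lowest set bit per customer (assumed meaning of 1st-TID)."""
    return {cid: (bits & -bits).bit_length() - 1 for cid, bits in bitmap.items()}


def extend_sequence(prefix_first_tid: Dict[str, int], next_itemset: Bitmap) -> Dict[str, int]:
    """Vertical step: append an itemset that must occur strictly after the
    prefix's first TID; return the new first TID for each supporting customer."""
    result: Dict[str, int] = {}
    for cid, tid in prefix_first_tid.items():
        later = next_itemset.get(cid, 0) >> (tid + 1)  # keep only later transactions
        if later:
            result[cid] = tid + 1 + ((later & -later).bit_length() - 1)
    return result


db: Database = {
    "c1": [frozenset("ab"), frozenset("c"), frozenset("ab")],
    "c2": [frozenset("a"), frozenset("bc")],
    "c3": [frozenset("ab"), frozenset("b"), frozenset("c")],
}
bitmaps = build_vertical_bitmaps(db)
ab = itemset_bitmap(bitmaps, "ab")               # large-itemset candidate {a, b}
c = itemset_bitmap(bitmaps, "c")
seq_ab_c = extend_sequence(first_tids(ab), c)    # sequence <{a, b}, {c}>
seq_ab_c_ab = extend_sequence(seq_ab_c, ab)      # reuses the {a, b} bitmap
```

Here the support of a pattern is the number of customers left in the returned dictionary. Keeping only the earliest end position per customer is sufficient for counting, because matching each element of a sequence as early as possible never rules out a later match. The last line reuses the already computed bitmap for {a, b}, which is the kind of large-itemset reuse the abstract describes.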