Affiliation: [1] Department of Computer Science, Hangzhou University of Commerce (杭州商学院)
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2003, No. 10, pp. 1488-1498 (11 pages)
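Funding: National "863" High-Tech Research and Development Program of China (2002AA121064); Natural Science Foundation of Zhejiang Province (602140); Science and Technology Program of the Zhejiang Provincial Department of Education (20020635)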
Abstract: Frequent pattern mining is a fundamental data mining problem, and its inherent complexity has made improving algorithm performance a persistent challenge. HP is a new algorithm that mines the complete set of frequent patterns through hybrid projection of the database. Its basic idea is that no dataset can simply be assigned to a single category such as dense or sparse, so the mining process should dynamically adjust its frequent pattern tree construction strategy, its representation of transaction subsets, and its projection method according to the changing characteristics of local data subsets. HP introduces tree-based pseudo-projection and array-based unfiltered projection, which largely resolve the conflict between improving time efficiency and saving memory. Comparative experiments show that HP is one to three orders of magnitude more efficient than Apriori, FP-Growth, and H-Mine, and that it is also more scalable.
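The general idea of mining frequent patterns by projecting the database can be illustrated with a small example. The following is only a minimal sketch of plain projection-based pattern growth, not the HP algorithm itself: it copies every projected subset into a new list, whereas the abstract describes HP as switching between tree-based pseudo-projection and array-based unfiltered projection. The function name `mine_frequent_patterns` and its parameters are hypothetical and chosen for illustration.

```python
from collections import Counter


def mine_frequent_patterns(transactions, min_support):
    """Return {frozenset(pattern): support} using recursive database projection.

    Illustrative only: each projected subset is materialized as a new list,
    which is the naive strategy, not HP's hybrid projection.
    """
    results = {}

    def grow(prefix, projected):
        # Count how many transactions of the projected database contain each item.
        counts = Counter(item for t in projected for item in t)
        for item in sorted(counts):
            support = counts[item]
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[frozenset(pattern)] = support
            # Project on `item`: keep the transactions that contain it, restricted
            # to items ordered after it so that each pattern is generated once.
            child = [[x for x in t if x > item] for t in projected if item in t]
            grow(pattern, [t for t in child if t])

    # Deduplicate and sort the items of each transaction before mining.
    grow((), [sorted(set(t)) for t in transactions])
    return results


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
    for pattern, support in sorted(
        mine_frequent_patterns(db, 2).items(), key=lambda kv: sorted(kv[0])
    ):
        print(sorted(pattern), support)
```

Copying each projected subset keeps the sketch short but costs memory on every recursion level, which is the kind of time versus space trade-off the abstract says HP's projection methods are designed to address.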
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]