检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]宁波大红鹰学院信息工程学院,浙江宁波315175
出 处:《计算机应用》2015年第7期1921-1926,共6页journal of Computer Applications
基 金:国家自然科学基金资助项目(61370200);宁波市自然科学基金资助项目(2013A610115;2014A610073);浙江省教育厅一般科研项目(Y201432717);宁波大红鹰学院大宗商品专项课题(1320133004)
摘 要:为提高不确定数据频繁模式(FP)挖掘算法的时空效率,提出了基于最大概率的不确定频繁模式挖掘(UFPM-MP)算法。首先,利用事务项集中的最大概率值预估期望支持数;然后,使用该期望支持数与最小期望支持数阈值进行比较,以确定某一项集是否为候选频繁项集,并对候选项集建立子树以递归挖掘频繁模式。实验中,UFPMMP算法与AT-Mine算法进行了对比,并在6个典型的数据集上进行实验验证。实验结果表明,UFPM-MP算法的时空效率得到了提高,稀疏数据集上提高约30%,稠密数据集上的效率提高更为明显(约3~4倍)。预估期望支持数的策略有效地减少了子树和头表项的数量,从而提高了算法的时空效率;且最小期望支持数越小,或需要挖掘的频繁模式越多的时候,算法的时间效率提高越多。To improve the time and space efficiency of Frequent Pattern (FP) mining algorithm over uncertain dataset, the Uncertain Frequent Pattern Mining based on Max Probability (UFPM-MP) algorithm was proposed. First, the expected support number was estimated using maximum probability of the transaction itemset. Second, by comparing this expected support number to the minimum expected support number threshold, the candidate frequent itemsets were identified. Finally, the corresponding sub-trees were built for recursively mining frequent patterns. The UFPM-MP algorithm was tested on 6 classical datasets against the state-of-the-art algorithm AT (Array based tail node Tree structure)-Mine with positive results ( about 30% improvement for sparse datasets, and 3 - 4 times more efficient for dense datasets). The expected support number estimation strategy effectively reduces the number of sub-trees and items of header table, and improves the algorithm's time and space efficiency; and when the minimum expected support threshold is low or there are lots of potential frequent patterns, time efficiency of the proposed algorithm performs more remarkably.
分 类 号:TP311.13[自动化与计算机技术—计算机软件与理论]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.15