Affiliation: [1] School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin 300387
Source: Computer Engineering & Science, 2016, No. 7, pp. 1510-1516 (7 pages)
Abstract: Data stream classification has become a major research topic in data mining in recent years. However, most traditional data stream classification algorithms handle only streams whose data items are known and precise, so they cannot be applied effectively to the uncertain data streams that are common in real applications. To build a classification model suited to data uncertainty and to improve classification accuracy on uncertain data streams, we propose an ensemble classification algorithm for uncertain data streams. The algorithm represents each uncertain data item as an interval together with its probability distribution function, and trains base classifiers with the C4.5 decision tree method and the naive Bayes method. It not only handles the uncertainty in data streams reasonably but also copes effectively with the concept drift hidden in them. Experimental results show that the proposed algorithm is robust when classifying uncertain data streams and achieves high classification accuracy.
Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]
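The abstract describes uncertain data items as an interval plus a probability distribution function. As a rough illustration of that representation only, the sketch below models one uncertain attribute value and computes the probability that it falls at or below a split threshold, which is the kind of quantity a decision-tree-style base classifier could use to weight an item across branches. The class name `UncertainValue`, the method `prob_at_most`, and the uniform density are assumptions made for this example; the record does not say which distribution or data structures the paper uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UncertainValue:
    """Hypothetical container: a value known only to lie in [low, high]."""
    low: float
    high: float

    def prob_at_most(self, threshold: float) -> float:
        """Return P(value <= threshold), assuming a uniform density on [low, high]."""
        if threshold >= self.high:
            # The whole interval (or a single point) lies at or below the threshold.
            return 1.0
        if threshold <= self.low:
            # The whole interval lies above the threshold.
            return 0.0
        # Fraction of the interval's length that lies at or below the threshold.
        return (threshold - self.low) / (self.high - self.low)


# Illustrative values only: an attribute known to lie somewhere in [2, 6].
reading = UncertainValue(low=2.0, high=6.0)
left_weight = reading.prob_at_most(3.0)   # share of the item sent to the "<= 3" branch
right_weight = 1.0 - left_weight          # share sent to the "> 3" branch
```

With a different density (for example, a truncated Gaussian), only the body of `prob_at_most` would change; the interval bounds and the branch-weighting idea would stay the same.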