Ensemble classification based on feature drifting in data streams (基于特征漂移的数据流集成分类方法)  Cited by: 5

Authors: ZHANG Yupei (张育培)[1], LIU Shuhui (刘树慧)[1]

Affiliation: [1] School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450052, China

Source: Computer Engineering & Science (《计算机工程与科学》), 2014, No. 5, pp. 977-985 (9 pages)

Abstract: To build a more effective classifier for data streams with hidden concept drift, and starting from the premise that different features differ in how critical they are to classification, an Ensemble Classifier for Feature Drifting in data streams (ECFD) is proposed. First, feature drift is defined and its relationship to concept drift is described. Second, mutual information theory is used to develop an Unsupervised Feature Filter (UFF) technique suited to data streams, which extracts critical feature subsets in order to detect feature drift. Finally, base classification algorithms capable of handling concept drift are selected to build a heterogeneous ensemble classifier on the critical feature subsets. The method offers a new approach to classifying high-dimensional data streams with hidden concept drift. Extensive experiments show that the method performs well in accuracy, running speed, and scalability, especially on high-dimensional data streams.
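The abstract does not say how UFF scores features or how a feature drift is declared, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes discrete feature values, scores each feature by its mean mutual information with the other features in a chunk (a hypothetical unsupervised criterion), keeps the top `k` features, and flags a feature drift when the selected subset changes between consecutive chunks. The names `mutual_information`, `top_k_features`, and `feature_drift` are illustrative.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def top_k_features(chunk, k):
    """Assumed unsupervised score: mean MI of each feature with all others.

    `chunk` is a list of rows; each row is a sequence of discrete values.
    Returns the set of indices of the k highest-scoring features.
    """
    d = len(chunk[0])
    cols = [[row[j] for row in chunk] for j in range(d)]
    scores = []
    for i in range(d):
        total = sum(
            mutual_information(cols[i], cols[j]) for j in range(d) if j != i
        )
        scores.append(total / (d - 1))
    ranked = sorted(range(d), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def feature_drift(prev_subset, curr_subset):
    """Assumed drift rule: the critical feature subset has changed."""
    return prev_subset != curr_subset


# Chunk A: features 0 and 1 are identical, feature 2 is independent of both.
chunk_a = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
# Chunk B: features 1 and 2 are identical, feature 0 is independent of both.
chunk_b = [(0, 0, 0), (1, 0, 0), (0, 1, 1), (1, 1, 1)]

subset_a = top_k_features(chunk_a, k=2)
subset_b = top_k_features(chunk_b, k=2)
print(subset_a, subset_b, feature_drift(subset_a, subset_b))
```

In a full pipeline along the lines the abstract describes, a detected change in the subset would be the point at which the ensemble's base classifiers are retrained or replaced on the new critical features; that step is omitted here because the abstract gives no details about it.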

Keywords: feature selection; feature drift; concept drift; data stream; mutual information; ensemble classifier

Classification number: TP274 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
