A new concept drift detection method based on distance measurement of overlapped data windows  Cited by: 5

Concept drift detection based on distance measurement of overlapped data windows


Authors: LIU Mao (刘茂)[1], ZHANG Dongbo (张东波)[1], ZHAO Yuanyuan (赵圆圆)[1]

Affiliation: [1] College of Information Engineering, Xiangtan University, Xiangtan, Hunan 411105, China

Source: Journal of Computer Applications (《计算机应用》), 2014, No. 2, pp. 542-545, 549 (5 pages)

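Funding: National Natural Science Foundation of China (60835004); Scientific Research Project of the Hunan Provincial Education Department (10B109); Hunan Provincial Key Discipline Construction Project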

Abstract: To address false detections and detection delays in concept drift detection on data streams, this paper proposes an online concept drift detection method based on the distance measurement of overlapped data windows. The data stream is divided into equal-sized, overlapping data windows, and the heterogeneous Euclidean distance between adjacent overlapped windows is computed. The nearest-neighbor principle is then used to judge the degree of inconsistency among the samples in the windows, which yields an evaluation of distribution difference and the detection of drift. To evaluate the method, experiments were carried out on public data sets with different drift severities and drift speeds. The results show that the method detects different types of concept drift quickly and accurately and can identify the specific locations where concept drift occurs.
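The abstract names the ingredients of the method (equal-sized overlapped windows, a heterogeneous Euclidean distance, a nearest-neighbor inconsistency check) but this record does not give the formulas. The Python sketch below is therefore only an illustration under stated assumptions, not the authors' algorithm: the distance is assumed to take the common heterogeneous Euclidean-overlap form (range-normalized difference for numeric attributes, 0/1 mismatch for nominal ones), and the inconsistency score, the function names, and the demo stream are all hypothetical.

```python
import math

def overlapped_windows(stream, size, step):
    """Split a stream into equal-sized windows; neighbours share size - step samples."""
    return [stream[i:i + size] for i in range(0, len(stream) - size + 1, step)]

def heom(x, y, ranges):
    """Assumed heterogeneous Euclidean distance between two feature tuples.

    Numeric attributes contribute |a - b| / range; nominal (string)
    attributes contribute 0 if equal and 1 otherwise.
    """
    total = 0.0
    for a, b, r in zip(x, y, ranges):
        if isinstance(a, str) or isinstance(b, str):
            d = 0.0 if a == b else 1.0
        else:
            d = abs(a - b) / r if r > 0 else 0.0
        total += d * d
    return math.sqrt(total)

def inconsistency(prev, curr, ranges):
    """Hypothetical score: fraction of samples in curr whose nearest
    neighbour in prev (under heom) carries a different label.

    Samples shared by both windows normally match themselves, so the
    score is driven by the newly arrived samples.
    """
    mismatches = 0
    for features, label in curr:
        nearest = min(prev, key=lambda s: heom(features, s[0], ranges))
        if nearest[1] != label:
            mismatches += 1
    return mismatches / len(curr)

def drift_scores(stream, size, step, ranges):
    """Return (position, score) pairs, one per pair of adjacent windows.

    position is the stream index of the first sample that is new in the
    later window, used here as a candidate drift location.
    """
    windows = overlapped_windows(stream, size, step)
    scores = []
    for k in range(1, len(windows)):
        position = k * step + size - step
        scores.append((position, inconsistency(windows[k - 1], windows[k], ranges)))
    return scores

def make_demo_stream():
    """Synthetic stream (an assumption for illustration): labels flip at index 20."""
    stream = []
    for i in range(40):
        x = (i % 10) / 10
        colour = "r" if i % 2 == 0 else "b"
        high = x >= 0.5
        if i >= 20:  # abrupt concept drift: the labelling rule is inverted
            high = not high
        stream.append(((x, colour), "hi" if high else "lo"))
    return stream
```

Calling `drift_scores(make_demo_stream(), size=10, step=5, ranges=(1.0, 1.0))` gives a score per pair of adjacent windows; positions whose score exceeds a chosen threshold (also an assumption, for example 0.3) would be flagged as candidate drift locations. Because adjacent windows overlap, a window that still mixes old and new samples can keep the score elevated for one further step after the drift.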

Keywords: concept drift; data stream; heterogeneous Euclidean distance; overlapped data windows

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
