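Affiliation: [1] School of Computer Science and Engineering, North Minzu University (北方民族大学), Yinchuan 750021
Source: Computer Science (《计算机科学》), 2016, No. 12, pp. 24-29, 62 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (61563001) and the North Minzu University Scientific Research Fund (2014XYZ13)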
Abstract: A data stream is a new data model that is dynamic, unbounded, high-dimensional, ordered, high-speed, and evolving. In real data stream environments, the distribution of some data changes over time; that is, the data exhibits concept drift. Such streams are known as evolving data streams or concept drift data streams. Methods for mining data streams therefore have to work within time and space constraints and adapt automatically to concept change. This paper surveys the concept drift problem together with classification, clustering, and pattern mining on concept drift data streams. It first introduces the types of concept drift and the commonly used methods for detecting concept change. To cope with concept drift, data stream mining commonly uses the sliding window model to process the most recent transactions. Data stream classification models include single-classifier models and ensemble models, and common methods include decision trees and classification association rules. Data stream clustering methods are generally divided into k-means-based and non-k-means-based methods. Pattern mining can provide useful information for classification, clustering, and association rules. Patterns in concept drift data streams include frequent patterns, sequential patterns, episodes, subtrees, subgraphs, and high utility patterns. Finally, the paper describes frequent pattern mining algorithms and high utility pattern mining algorithms in detail.
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
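The abstract notes that the sliding window model is commonly used to process only the most recent transactions, so that patterns from an outdated concept fade out of the result. The following is a minimal Python sketch of that idea for single-item supports, not an algorithm taken from the surveyed paper. The class name `SlidingWindowCounter`, its methods, and the `window_size` and `min_support` parameters are all assumptions made for illustration.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Hypothetical helper: item support counts over the most recent transactions."""

    def __init__(self, window_size):
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        self.window_size = window_size
        self.window = deque()    # transactions currently inside the window
        self.counts = Counter()  # item -> number of window transactions containing it

    def add(self, transaction):
        """Insert a new transaction and evict the oldest one if the window is full."""
        items = frozenset(transaction)  # count each item at most once per transaction
        self.window.append(items)
        self.counts.update(items)
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(expired)
            for item in expired:
                if self.counts[item] == 0:
                    del self.counts[item]  # drop items no longer in the window

    def frequent_items(self, min_support):
        """Return items whose relative support in the window is at least min_support."""
        threshold = min_support * len(self.window)
        return {item for item, count in self.counts.items() if count >= threshold}
```

Feeding transactions one at a time through `add` and calling `frequent_items` after each step shows the intended behavior: once the stream's content shifts, items that were frequent under the old concept drop out as soon as their transactions leave the window. Extending this from single items to itemsets, or adding an explicit change detector, is outside the scope of this sketch.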