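Affiliations: [1] College of Economics and Management, Shandong University of Science and Technology, Qingdao 266590; [2] School of Economics, Shandong University, Jinan 250100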
Source: Journal of Intelligence (《情报杂志》), 2012, No. 9, pp. 163-168 (6 pages)
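Funding: Supported by the Shandong Provincial Natural Science Foundation project "Dynamic Modeling of Risk Early Warning for Derivative Financial Instruments Oriented to Concept Drift in Vertical and Horizontal Stream Data" (No. ZR2009HQ001) and the Ministry of Education Humanities and Social Sciences project "Risk Early-Warning Methods for Derivative Financial Instruments Based on Concept Drift in Stream Data" (No. 10YJCZH218).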
Abstract: Accurate classification of multivariate data streams is a current focus and a difficult problem in data mining and information science, and it is drawing attention from a growing number of research groups in China and abroad. Most previous work, however, extracts features from a single stream and classifies on that basis, without considering the dependencies among features within a stream or across streams. To address this, the paper borrows motif-finding methods from bioinformatics and proposes a classification method based on long-term frequency and inverse document frequency. The method converts each input stream into a symbol sequence that describes how the signal changes, and divides the symbols into blocks of different lengths so that motifs can be extracted more effectively. Weights computed from motif frequency, long-term frequency, and inverse document frequency are used to measure the temporal relationships between motifs from different input streams, and the motifs together with these temporal relationships are then used to classify the multivariate data streams. According to the authors, this ensures classification accuracy, and simulation experiments support the method's effectiveness.
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
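The pipeline outlined in the abstract (symbolize each stream, cut the symbols into blocks of several lengths, weight the resulting motifs by frequency and inverse document frequency) can be illustrated with a minimal Python sketch. The abstract does not specify the symbol alphabet, the block lengths, or the exact weighting formula, so everything here is an assumption made for illustration: the three-symbol `U`/`D`/`S` encoding, the block lengths `(2, 3)`, the function names, and the use of standard TF-IDF with each stream treated as a "document". The temporal relationships between motifs and the final classifier are not sketched.

```python
import math
from collections import Counter


def symbolize(stream, eps=0.0):
    """Map each step-to-step change to 'U' (up), 'D' (down), or 'S' (steady).

    Hypothetical encoding; the paper's actual symbol alphabet is not given.
    """
    symbols = []
    for prev, cur in zip(stream, stream[1:]):
        diff = cur - prev
        if diff > eps:
            symbols.append("U")
        elif diff < -eps:
            symbols.append("D")
        else:
            symbols.append("S")
    return "".join(symbols)


def motif_counts(symbols, lengths=(2, 3)):
    """Count every contiguous block (candidate motif) of the given lengths."""
    counts = Counter()
    for n in lengths:
        for i in range(len(symbols) - n + 1):
            counts[symbols[i:i + n]] += 1
    return counts


def tfidf(documents):
    """Standard TF-IDF (an assumption): tf = count / total, idf = log(N / df)."""
    n_docs = len(documents)
    df = Counter()
    for counts in documents:
        df.update(counts.keys())  # each motif counted once per stream
    weights = []
    for counts in documents:
        total = sum(counts.values())
        weights.append(
            {m: (c / total) * math.log(n_docs / df[m]) for m, c in counts.items()}
        )
    return weights


# Three toy streams standing in for the input variables.
streams = [
    [1, 2, 3, 4, 5],  # steadily rising
    [5, 4, 3, 2, 1],  # steadily falling
    [1, 2, 1, 2, 1],  # oscillating
]
docs = [motif_counts(symbolize(s)) for s in streams]
weights = tfidf(docs)
```

In this toy setting, a motif that appears in every stream receives weight zero, while a motif confined to one stream receives a positive weight, which is the property that makes such weights usable as discriminating features.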