Correlation Analysis of Multi-Dimensional Data Streams Based on Low-Rank Approximation  Cited by: 12

A Correlation Analysis Algorithm Based on Low-Rank Approximation for Multiple Dimension Data Streams


Authors: Wang Yongli (王永利)[1], Xu Hongbing (徐宏炳)[1], Dong Yisheng (董逸生)[1], Qian Jiangbo (钱江波)[1], Liu Xuejun (刘学军)[1]

Affiliation: [1] Department of Computer Science and Engineering, Southeast University

Source: Acta Electronica Sinica (《电子学报》), 2006, No. 2, pp. 293-300 (8 pages)

Funding: Jiangsu Province 2004 Graduate Student Innovation Program (No. xm04-36); Jiangsu Province High-Technology Project (No. BG2004034)

Abstract: Most existing correlation analysis methods for multiple data streams target only single-attribute (one-dimensional) streams, and therefore cannot capture the true correlation between fields composed of multiple variables. To detect correlations between multi-dimensional data streams quickly in resource-constrained environments, this paper proposes StreamCCA, a novel correlation analysis algorithm based on canonical correlation analysis (CCA). To address the performance bottleneck of traditional CCA computation, StreamCCA applies an efficient low-rank approximation that reduces the dimensionality of the product matrix formed from the sample variance and covariance matrices, which significantly improves computational efficiency while preserving analysis accuracy. Theoretical analysis and experiments on synthetic and real data sets show that StreamCCA can accurately identify, online, the correlation between two multi-dimensional data streams, and that it can serve as a general-purpose forecasting and diagnostic tool for data stream mining.
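For readers unfamiliar with CCA, the following is a minimal sketch of the classical batch computation that the abstract describes as the bottleneck: forming the product matrix from the sample variance and covariance matrices and taking its eigenvalues. It is not the StreamCCA algorithm and does not include the paper's low-rank approximation or sampling step. The function name `canonical_correlations` and the small ridge term `reg` are assumptions introduced here for illustration.

```python
import numpy as np

def canonical_correlations(X, Y, reg=1e-8):
    """Classical batch CCA (illustrative only, not StreamCCA).

    X: (n, p) samples of the first stream, Y: (n, q) samples of the second.
    Returns the min(p, q) canonical correlations in descending order.
    `reg` is a hypothetical ridge term added for numerical stability.
    """
    X = X - X.mean(axis=0)          # center each variable
    Y = Y - Y.mean(axis=0)
    n, p = X.shape
    q = Y.shape[1]

    # Sample variance matrices and cross-covariance matrix.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(p)
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(q)
    Sxy = X.T @ Y / (n - 1)

    # Product matrix M = Sxx^{-1} Sxy Syy^{-1} Syx; its nonzero
    # eigenvalues are the squared canonical correlations.
    M = np.linalg.solve(Sxx, Sxy) @ np.linalg.solve(Syy, Sxy.T)
    eigvals = np.sort(np.linalg.eigvals(M).real)[::-1][: min(p, q)]
    return np.sqrt(np.clip(eigvals, 0.0, 1.0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 4))
    Y = X[:, :2] + 0.1 * rng.standard_normal((500, 2))  # Y depends on X
    print(canonical_correlations(X, Y))
```

The eigen-decomposition of the p-by-p product matrix is the step whose cost grows quickly with dimension, which is presumably why the paper targets it with a low-rank approximation.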

Keywords: data streams; canonical correlation analysis; low-rank approximation; unequal-probability sampling; data stream mining

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
