Feature selection method of high-dimensional data based on random matrix theory  (Cited by: 4)

Authors: WANG Yan (王妍)[1], YANG Jun (杨钧), SUN Lingfeng (孙凌峰), LI Yunuo (李玉诺), SONG Baoyan (宋宝燕)[1]

Affiliations: [1] School of Information, Liaoning University, Shenyang 110036; [2] Smart City Development Department, 荣科科技股份有限公司, Shenyang 110027

Source: 《计算机应用》 (Journal of Computer Applications), 2017, No. 12, pp. 3467-3471 (5 pages)

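Funding: National Natural Science Foundation of China (61472169; 61472072; 61528202; 61501105); National 973 Program Preliminary Research Special Project (2014CB360509); General Scientific Research Project of the Education Department of Liaoning Province (L2015204)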

Abstract: Traditional feature selection methods mostly remove redundant features by means of correlation measures, but they do not account for the large amount of noise present in a high-dimensional correlation matrix, which seriously degrades the feature selection result. To solve this problem, a feature selection method based on Random Matrix Theory (RMT) is proposed. First, the singular values of the correlation matrix that conform to the random matrix prediction are removed, yielding the denoised correlation matrix and the number of features to select. Then, singular value decomposition is performed on the denoised correlation matrix, and the correlation between each feature and the class is obtained from the decomposed matrices. Finally, feature selection is completed according to the correlation between features and the class and the redundancy among features. In addition, a feature selection optimization method is proposed, which further refines the result by setting each feature in turn as a random variable and comparing the resulting singular value vector with the original singular value vector. Classification experiments show that the proposed method can effectively improve classification accuracy and reduce the training data scale.

Keywords: random matrix; feature selection; denoising; singular value; correlation matrix

Classification number: TP391.1 [Automation and Computer Technology: Computer Application Technology]
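
The first step described in the abstract, removing the singular values that match the random matrix prediction, could be sketched roughly as follows. This is only an illustration under stated assumptions and not the authors' implementation: the abstract does not say which RMT prediction is used, so the sketch assumes the Marchenko-Pastur upper edge (1 + sqrt(p/n))^2 for a correlation matrix built from n samples of p features. The function names `marchenko_pastur_upper` and `denoise_correlation` are hypothetical. The later steps (correlation between features and the class, redundancy among features, and the optimization pass) are not specified in enough detail here to sketch.

```python
import numpy as np


def marchenko_pastur_upper(n_samples, n_features):
    """Upper edge of the Marchenko-Pastur spectrum for a pure-noise
    correlation matrix (assumed threshold; not stated in the abstract)."""
    q = n_features / n_samples
    return (1.0 + np.sqrt(q)) ** 2


def denoise_correlation(X):
    """Drop singular values of the feature correlation matrix that fall
    inside the assumed noise band.

    X has shape (n_samples, n_features).
    Returns (denoised correlation matrix, number of retained singular values).
    """
    n_samples, n_features = X.shape
    corr = np.corrcoef(X, rowvar=False)

    # SVD of the correlation matrix; singular values come back in
    # descending order.
    U, s, Vt = np.linalg.svd(corr)

    # Keep only singular values above the assumed noise edge.
    keep = s > marchenko_pastur_upper(n_samples, n_features)

    # Rebuild the matrix from the retained components only.
    denoised = (U[:, keep] * s[keep]) @ Vt[keep, :]
    return denoised, int(keep.sum())
```

In this sketch the count of retained singular values stands in for the "number of features to select" that the abstract mentions; how the paper maps one to the other is not stated in this record.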

 
