Random Subspace-Based Semi-Supervised Dimensionality Reduction for Cancer Classification (article in English)


Authors: Wen Guihua (文贵华)[1], Cai Xianfa (蔡先发)[1,2,3], Wei Jia (韦佳)[1]

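Affiliations: [1] School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, Guangdong, China; [2] School of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, Guangdong, China; [3] Shenzhen Key Laboratory of High Performance Data Mining, Shenzhen 518055, Guangdong, China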

Source: Journal of South China University of Technology (Natural Science Edition), 2013, No. 7, pp. 137-144 (8 pages)

Funding: Supported by the National Natural Science Foundation of China (61273363, 61070090, 61003174, 60973083)

Abstract: Precise cancer classification is essential to the successful diagnosis and treatment of cancer. Although semi-supervised dimensionality reduction approaches perform well on clean data sets, the neighborhood structure constructed by most existing approaches is topologically unstable in the presence of noise. To address this problem, the paper proposes a random subspace-based semi-supervised dimensionality reduction algorithm (RSSSDR), which combines random subspaces with semi-supervised dimensionality reduction. The algorithm first constructs multiple diverse subgraphs in different random subspaces of the data set, then combines them into a mixture graph on which dimensionality reduction is performed. The edge weights of the neighborhood graph are determined by minimizing the local reconstruction error, so that both the local and the global structure of the cancer data are preserved. Experimental results on public cancer data sets show that RSSSDR achieves high classification accuracy and good robustness to its parameters.
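The graph-construction steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how subgraphs are built or combined, so the sketch assumes k-nearest-neighbour subgraphs merged by taking the union of their edges, and it assumes LLE-style weights (each point reconstructed from its graph neighbours, weights summing to one) for the "minimize the local reconstruction error" step. The function names, the parameters `n_subspaces`, `subspace_dim`, `k`, and `reg`, and the regularization term are all hypothetical. The use of labels and the final embedding step are omitted.

```python
import numpy as np


def knn_adjacency(X, k):
    """Boolean matrix A where A[i, j] is True if j is one of i's k nearest neighbours."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros(d2.shape, dtype=bool)
    A[np.arange(X.shape[0])[:, None], idx] = True
    return A


def mixture_graph(X, n_subspaces=10, subspace_dim=5, k=3, seed=0):
    """Union of kNN subgraphs, each built on a random subset of the features.

    Assumption: subgraphs are combined by edge union; the paper may differ.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    A = np.zeros((n, n), dtype=bool)
    for _ in range(n_subspaces):
        feats = rng.choice(d, size=subspace_dim, replace=False)
        A |= knn_adjacency(X[:, feats], k)
    return A


def reconstruction_weights(X, A, reg=1e-3):
    """LLE-style edge weights: minimise ||x_i - sum_j W[i, j] x_j||^2 over the
    neighbours j of i in A, subject to sum_j W[i, j] = 1."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.flatnonzero(A[i])
        Z = X[nbrs] - X[i]  # neighbours centred on x_i
        G = Z @ Z.T  # local Gram matrix
        # Ridge term keeps G invertible when neighbours outnumber dimensions.
        G += reg * (np.trace(G) + 1e-12) * np.eye(len(nbrs))
        w = np.linalg.solve(G, np.ones(len(nbrs)))
        W[i, nbrs] = w / w.sum()
    return W


if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(20, 30))  # synthetic stand-in data
    A = mixture_graph(X)
    W = reconstruction_weights(X, A)
    print(A.sum(axis=1))  # neighbours per sample in the mixture graph
```

Because each subspace sees only some of the features, a noisy feature can distort at most the subgraphs that sampled it, which is one plausible reading of why the mixture graph is described as more stable than a single neighborhood graph.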

Keywords: semi-supervised learning; random subspace; cancer classification; dimensionality reduction

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
