Analysis of gene expression profile data with an improved label propagation algorithm


Authors: 王年 (WANG Nian)[1], 葛芳 (GE Fang)[1], 王俊生 (WANG Junsheng)[1], 唐俊 (TANG Jun)[1]

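Affiliation: [1] Key Laboratory of Intelligent Computing and Signal Processing of the Ministry of Education, Anhui University, Hefei 230039, Anhui, China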

Source: Journal of Central South University: Science and Technology (中南大学学报(自然科学版)), 2014, No. 7, pp. 2237-2243 (7 pages)

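Funding: National Natural Science Foundation of China (61172127); Natural Science Foundation of Anhui Province (1208085MF93; 1208085QF104); Anhui University "211 Project" Academic Innovation Team Fund (KJTD007A)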

Abstract: To address the excessive number of iterations and the uncertainty of threshold selection in the original label propagation algorithm, an improved label propagation algorithm is proposed and applied to the analysis of gene expression profile data. First, the high-dimensional gene expression profile data are represented as a weight matrix, and a label sequence indicating the class membership of the samples is defined, with a small number of samples marked as labeled. Then, the label sequence is updated with an iterative formula derived from the Gauss-Seidel iteration, and the solution of the label sequence is proved to converge. Finally, using positive and negative labels, the data are partitioned into classes according to the signs of the components of the label sequence. Experiments on the leukemia and colon cancer data sets show that the proposed method is feasible and effective.
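
The three steps in the abstract (weight matrix, Gauss-Seidel style update of the label sequence, partition by sign) can be illustrated with a minimal sketch. It is not the authors' formula, which this record does not give. It assumes a Gaussian-kernel weight matrix and the standard harmonic update, in which each unlabeled component becomes the weighted average of its neighbours and is overwritten in place. The function names, the `sigma` bandwidth, and the stopping tolerance are all hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_weights(X, sigma=1.0):
    """Assumed weight matrix: W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), zero diagonal."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def propagate_signed_labels(W, y, tol=1e-8, max_sweeps=1000):
    """Propagate +1/-1 labels to samples marked 0 (unlabeled).

    Labeled entries stay clamped. Each unlabeled entry is replaced by the
    weighted average of the current values of all other samples, and the new
    value is used immediately within the same sweep (Gauss-Seidel ordering).
    Assumes every row of W has a positive sum.
    """
    f = np.asarray(y, dtype=float).copy()
    unlabeled = np.flatnonzero(f == 0)
    degree = W.sum(axis=1)
    sweep = 0
    for sweep in range(1, max_sweeps + 1):
        max_change = 0.0
        for i in unlabeled:
            new_value = W[i] @ f / degree[i]
            max_change = max(max_change, abs(new_value - f[i]))
            f[i] = new_value  # in-place update, visible to later rows
        if max_change < tol:
            break
    return f, sweep


def classify_by_sign(f):
    """Assign class +1 or -1 from the sign of each component; no threshold is tuned."""
    return np.where(f >= 0, 1, -1)
```

A hypothetical usage on an expression matrix `X` (samples in rows) would set `y` to +1 or -1 for the few known samples and 0 elsewhere, then call `classify_by_sign(propagate_signed_labels(gaussian_weights(X), y)[0])`. Because the decision depends only on the sign, no class threshold has to be chosen, which is the motivation the abstract gives for the positive and negative labels.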

Keywords: semi-supervised learning; weight matrix; label propagation; gene expression profile data

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
