A Gene Clustering Algorithm Based on CCA-Hierarchical Clustering


Author: LIN Qianmin (林倩闽) (School of Electrical Engineering and Automation, Xiamen University of Technology, Xiamen 361024, China)

Affiliation: [1] School of Electrical Engineering and Automation, Xiamen University of Technology, Xiamen 361024, Fujian, China

Source: Journal of Harbin University of Science and Technology (《哈尔滨理工大学学报》), 2023, No. 5, pp. 85-90 (6 pages)

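Funding: Guiding Project of the Fujian Provincial Department of Science and Technology (2019H0039); Fujian Provincial Education and Research Project for Young and Middle-aged Teachers (JAT210341).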

Abstract: To mine the biological information and potential biological mechanisms contained in the massive gene expression data produced by gene chip technology, this paper proposes a gene clustering algorithm based on CCA-hierarchical clustering (CCA-Hc). The algorithm introduces canonical correlation analysis (CCA) into hierarchical clustering to improve how the similarity matrix is computed. First, CCA combines the multiple feature information of each gene to measure gene correlation, yielding a gene similarity matrix. This similarity matrix then serves as the proximity matrix for agglomerative hierarchical clustering. CCA-Hc was tested on a gene expression dataset of Oryza sativa L. (rice). The results show that, compared with the traditional hierarchical clustering algorithm using Euclidean distance (EUC-Hc), CCA-Hc performs better on both the internal stability indices and the biological functional indices, giving better robustness and clustering accuracy and making it more conducive to discovering co-expression relationships between genes.
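The two-step pipeline described in the abstract (CCA-based similarity matrix, then agglomerative clustering on it) can be sketched roughly as follows. This is an illustrative sketch, not the paper's implementation: the abstract does not specify the feature layout, the similarity definition, or the linkage rule, so the code assumes each gene is a samples-by-features matrix, uses the largest canonical correlation as the similarity, converts it to a distance as 1 - similarity, and merges clusters by average linkage. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def first_canonical_correlation(X, Y):
    """Largest canonical correlation between two (samples x features) blocks."""
    # Center each feature column; CCA is defined on centered data.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases of the two column spaces (assumes full column rank).
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # The singular values of Qx^T Qy are the canonical correlations.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))


def cca_similarity_matrix(genes):
    """genes has shape (n_genes, n_samples, n_features); returns (n_genes, n_genes)."""
    n = len(genes)
    S = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            S[i, j] = S[j, i] = first_canonical_correlation(genes[i], genes[j])
    return S


def agglomerative_average(D, n_clusters):
    """Naive average-linkage agglomerative clustering on a distance matrix D."""
    clusters = [[i] for i in range(len(D))]
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                d = D[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]                          # b > a, so index a stays valid
    labels = np.empty(len(D), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels


# Synthetic stand-in for expression data: two groups of four genes, each group
# driven by its own latent profile (an assumption made only for this demo).
rng = np.random.default_rng(0)
n_samples, n_features = 40, 3
latent = rng.normal(size=(2, n_samples))
genes = []
for g in range(2):
    for _ in range(4):
        weights = rng.uniform(0.5, 1.5, size=n_features)
        noise = 0.1 * rng.normal(size=(n_samples, n_features))
        genes.append(np.outer(latent[g], weights) + noise)
genes = np.array(genes)

S = cca_similarity_matrix(genes)   # gene similarity matrix
D = 1.0 - S                        # proximity (distance) matrix
labels = agglomerative_average(D, n_clusters=2)
print(labels)
```

The naive merge loop is cubic in the number of genes and is meant only to make the procedure explicit; for real datasets an optimized routine such as SciPy's `scipy.cluster.hierarchy.linkage` on the same distance matrix would be the more practical choice.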

Keywords: gene expression data; clustering algorithm; canonical correlation analysis; hierarchical clustering

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
