Cancer Pathogenic Gene Prediction Based on Differential Co-expression Adjacent Network


Authors: LI Zhijie; LIAO Xuhong; LI Qinglan; LIU Li

Affiliations: [1] School of Information Science and Engineering, Hunan Institute of Science and Technology, Yueyang, Hunan 414006, China; [2] Medical College, University of Pennsylvania, Philadelphia 19019, USA; [3] Medical College, Virginia Commonwealth University, Richmond 23284, USA

Source: Computer Science (《计算机科学》), 2025, No. 5, pp. 161-170 (10 pages)

Funding: National Natural Science Foundation of China (62072475, 61672391); Natural Science Foundation of Hunan Province (2019JJ40111).

Abstract: Cancer is the leading killer threatening human health. With the rapid development of sequencing technology, massive amounts of cancer gene expression data have accumulated, and predicting pathogenic genes by computational methods has become a new focus of cancer research. However, most current approaches to pathogenic gene prediction rely on gene interaction networks and similar resources, and rarely consider the potential link between local network connectivity and differential gene expression. To address this, the paper first uses gene expression difference data from before and after disease onset, computes the correlation between genes through mutual information, and constructs an adjacency network. It then designs a feature vector model for predicting cancer pathogenic genes, whose features include the differential expression information of a candidate gene and its nearest neighbors. Cancer-related pathogenic and non-pathogenic genes, together with gene differential expression data from before and after disease onset, are obtained from public databases such as TCGA, OMIM, and GEO for the experiments. The resulting method, which predicts cancer pathogenic genes from the differential expression information of genes and their nearest neighbors in the adjacency network, is named DICPG (Differential Information of Gene and Nearest Neighbor for Cancer Pathogenic Gene Prediction). Experimental results show that the DICPG cancer gene classification model is biologically meaningful and that it outperforms comparable methods on classification accuracy, AUC, and other performance metrics.
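The pipeline described in the abstract (mutual information between differential expression profiles, an adjacency network, then a feature vector built from a gene and its nearest neighbors) can be illustrated with a minimal sketch. The abstract does not specify the mutual information estimator, the rule for choosing neighbors, or the exact features, so everything below is an assumption made for illustration: equal-width binning, a top-k neighbor rule, mean differential expression as the feature, and the hypothetical names `discretize`, `mutual_information`, `build_adjacency`, `feature_vector`, and the toy genes `G1` to `G3`. It is not the DICPG implementation.

```python
import math
from collections import Counter

def discretize(values, bins=3):
    """Equal-width binning of floats into integer labels (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=3):
    """Plug-in mutual information estimate (in nats) of two binned profiles."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    n = len(dx)
    px, py, pxy = Counter(dx), Counter(dy), Counter(zip(dx, dy))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def build_adjacency(diff_expr, k=2, bins=3):
    """Link each gene to its k highest-MI partners (assumed neighbor rule)."""
    genes = list(diff_expr)
    neighbors = {}
    for g in genes:
        scored = [(mutual_information(diff_expr[g], diff_expr[h], bins), h)
                  for h in genes if h != g]
        # Highest MI first; ties are broken by gene name for determinism.
        scored.sort(key=lambda t: (-t[0], t[1]))
        neighbors[g] = [h for _, h in scored[:k]]
    return neighbors

def feature_vector(gene, diff_expr, neighbors):
    """Mean differential expression of the gene, then of each neighbor."""
    def mean(v):
        return sum(v) / len(v)
    return [mean(diff_expr[gene])] + [mean(diff_expr[h]) for h in neighbors[gene]]

# Toy data (invented): per-sample differential expression for three genes.
diff_expr = {
    "G1": [2.0, 1.8, -0.1, 0.1, 2.1, -0.2],
    "G2": [1.5, 1.7, 0.0, 0.2, 1.6, 0.1],
    "G3": [0.3, -0.3, 0.3, -0.3, 0.3, -0.3],
}
neighbors = build_adjacency(diff_expr, k=1, bins=2)
vectors = {g: feature_vector(g, diff_expr, neighbors) for g in diff_expr}
print(neighbors)
print(vectors)
```

In a sketch like this, the resulting vectors would be passed to any standard classifier trained on genes labeled pathogenic or non-pathogenic; the abstract does not say which classifier the authors use.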

Keywords: gene differential expression data; adjacency network; candidate gene; gene feature vector; cancer pathogenic gene prediction

CLC number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
