Continuous and Discrete Similarity Coefficient for Identifying Essential Proteins Using Gene Expression Data  (Cited by: 1)


Authors: Jiancheng Zhong, Zuohang Qu, Ying Zhong, Chao Tang, Yi Pan

Affiliations: [1] College of Information Science and Engineering, Hunan Normal University, Changsha 410081, China [2] Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences Shenzhen, Guangzhou 518055, China

Source: Big Data Mining and Analytics (English-language journal), 2023, Issue 2, pp. 185-200 (16 pages)

Funding: Supported by the Shenzhen KQTD Project (No. KQTD20200820113106007); the China Scholarship Council (No. 201906725017); the Collaborative Education Project of Industry-University Cooperation of the Chinese Ministry of Education (No. 201902098015); the Teaching Reform Project of Hunan Normal University (No. 82); and the National Undergraduate Training Program for Innovation (No. 202110542004).

Abstract: Essential proteins play a vital role in biological processes, and combining gene expression profiles with Protein-Protein Interaction (PPI) networks can improve the identification of essential proteins. However, gene expression data are prone to significant fluctuations due to noise interference in topological networks. In this work, we discretized gene expression data and used the discrete similarities of the gene expression spectrum to eliminate noise fluctuation. We then proposed the Pearson Jaccard coefficient (PJC), which consists of continuous and discrete similarities in the gene expression data. Using graph theory as the basis, we fused the newly proposed similarity coefficient with existing network topology prediction algorithms at each protein node to recognize essential proteins. This strategy exhibited a high recognition rate and good specificity. We validated the new similarity coefficient PJC on the Krogan, Gavin, and DIP PPI datasets of yeast and evaluated the results by receiver operating characteristic analysis, jackknife analysis, top analysis, and accuracy analysis. Compared with node-based network topology centrality and fusion biological information centrality methods, the new similarity coefficient PJC showed significantly improved prediction performance for essential proteins in DC, IC, eigenvector centrality, subgraph centrality, betweenness centrality, closeness centrality, NC, PeC, and WDC. We also compared the PJC coefficient with other methods using the NF-PIN algorithm, which predicts proteins by constructing active PPI networks through dynamic gene expression. The experimental results showed that the newly proposed similarity coefficient PJC has advantages in predicting essential proteins.
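
The idea of pairing a continuous similarity with a discrete one can be illustrated with a minimal Python sketch. The abstract does not give the PJC formula, so everything specific here is an assumption made for illustration: the function names are hypothetical, expression values are binarized against each profile's own mean, and the two similarities are combined by a simple product. It should not be read as the authors' definition.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile has no defined correlation
    return cov / math.sqrt(vx * vy)


def discretize(x):
    """Binarize a profile: 1 where the value exceeds the profile mean.

    ASSUMPTION: the mean threshold is illustrative, not taken from the paper.
    """
    m = sum(x) / len(x)
    return [1 if v > m else 0 for v in x]


def jaccard(a, b):
    """Jaccard similarity of two binary vectors (shared 1s / any 1s)."""
    inter = sum(1 for p, q in zip(a, b) if p and q)
    union = sum(1 for p, q in zip(a, b) if p or q)
    return inter / union if union else 0.0


def pjc_sketch(x, y):
    """ASSUMPTION: combine continuous and discrete similarity as a product."""
    return pearson(x, y) * jaccard(discretize(x), discretize(y))
```

In a PPI network, a score of this kind could be computed for the two expression profiles at the ends of each interaction and used as an edge weight for a centrality measure; how the paper actually performs that fusion is not stated in the abstract.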

Keywords: Protein-Protein Interaction (PPI) network; continuous and discrete similarity coefficient; essential proteins

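Classification codes: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]; R96 [Automation and Computer Technology: Control Science and Engineering]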

 
