Multi-information fusion and self-attention identification of phosphorylation sites of SARS-CoV-2


Authors: YAN Lu; LAI Jiali; WANG Minghui[1] (School of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266042, China)

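Affiliation: [1] School of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266042, Shandong, China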

Source: Journal of Chongqing University of Technology: Natural Science, 2023, No. 6, pp. 242-248 (7 pages)

Funding: National Natural Science Foundation of China, General Program (12171210).

Abstract: The disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is threatening people's health. Identifying phosphorylation sites is an important step toward understanding the molecular mechanism of SARS-CoV-2 infection. Because experimental methods are limited, effective prediction models are needed, and a new SARS-CoV-2 phosphorylation site prediction model, Self-DeepIPs, is therefore proposed. Four feature extraction methods, dipeptide composition (DC), enhanced amino acid composition (EAAC), composition, transition and distribution (CTD), and BLOSUM62, convert protein sequence information into numerical features. These features are concatenated end to end, and the mutual information (MI) method is used to remove redundant information. A deep learning model that combines BiLSTM with a self-attention mechanism is then built to predict SARS-CoV-2 phosphorylation sites, and it is evaluated with five-fold cross-validation. ACC and AUC reach 83.62% and 91.70% on the training set and 82.56% and 91.23% on the independent test set, respectively. The experimental results show that Self-DeepIPs can effectively identify SARS-CoV-2 phosphorylation sites.
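The feature-fusion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it covers only two of the four encodings (DC and EAAC), and the window size of 5, the binarisation of features before scoring, and all function names (`dipeptide_composition`, `enhanced_aac`, `fuse`, `mutual_information`, `select_top_k`) are assumptions made for illustration, since the abstract gives none of these details.

```python
from collections import Counter
from itertools import product
from math import log

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """DC: frequency of each of the 400 ordered amino-acid pairs."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    total = max(len(pairs), 1)  # avoid division by zero on very short input
    return [counts[d] / total for d in DIPEPTIDES]


def enhanced_aac(seq, window=5):
    """EAAC: amino-acid frequencies inside each sliding window.

    The window size is an assumed value, not taken from the paper.
    """
    features = []
    for start in range(len(seq) - window + 1):
        counts = Counter(seq[start:start + window])
        features.extend(counts[a] / window for a in AMINO_ACIDS)
    return features


def fuse(seq, window=5):
    """Concatenate the per-method vectors end to end."""
    return dipeptide_composition(seq) + enhanced_aac(seq, window)


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def select_top_k(matrix, labels, k):
    """Keep the k columns with the highest MI against the labels.

    Features are binarised as "> 0" before scoring; this is a
    simplifying assumption of the sketch.
    """
    n_cols = len(matrix[0])
    scores = [
        mutual_information([row[j] > 0 for row in matrix], labels)
        for j in range(n_cols)
    ]
    return sorted(range(n_cols), key=lambda j: scores[j], reverse=True)[:k]
```

With these assumptions, a 9-residue peptide yields 400 DC values plus 5 windows of 20 EAAC values, so `fuse` returns a 500-dimensional vector. `select_top_k` then returns the indices of the columns to keep before the fused features would be passed to a classifier.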

Keywords: SARS-CoV-2 phosphorylation; multi-information fusion; self-attention mechanism; deep learning

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
