Authors: Yumeng Zhang, Jiahao Guan, Chen Li, Zhikang Wang, Zixin Deng, Robin B. Gasser, Jiangning Song, Hong-Yu Ou
Affiliations: [1] State Key Laboratory of Microbial Metabolism, Joint International Laboratory on Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China [2] Shanghai Key Laboratory of Veterinary Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China [3] Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia [4] Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia [5] Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, VIC 3010, Australia
Source: Research (研究(英文)), 2024, Issue 3, pp. 243-257 (15 pages)
Funding: The National Natural Science Foundation of China (32070572); the Foundation of Key Laboratory of Veterinary Biotechnology (shklab202005), Shanghai, China; and the Science and Technology Commission of Shanghai Municipality (19JC1413000). R.B.G. and J.S. were supported by grants from the Australian Research Council (ARC) (LP220200614).
Abstract: Proteins secreted by Gram-negative bacteria are tightly linked to the virulence and adaptability of these microbes to environmental changes. Accurate identification of such secreted proteins can facilitate the investigation of infections and diseases caused by these bacterial pathogens. However, current bioinformatic methods for predicting bacterial secreted substrate proteins have limited computational efficiency and application scope on a genome-wide scale. Here, we propose a novel deep-learning-based framework, DeepSecE, for the simultaneous inference of multiple distinct groups of secreted proteins produced by Gram-negative bacteria. DeepSecE remarkably improves their classification from nonsecreted proteins using a pretrained protein language model and transformer, achieving a macro-average accuracy of 0.883 on 5-fold cross-validation. Performance benchmarking suggests that DeepSecE achieves competitive performance with the state-of-the-art binary predictors specialized for individual types of secreted substrates. The attention mechanism corroborates salient patterns and motifs at the N or C termini of the protein sequences. Using this pipeline, we further investigate the genome-wide prediction of novel secreted proteins and their taxonomic distribution across ~1,000 Gram-negative bacterial genomes. The present analysis demonstrates that DeepSecE has major potential for the discovery of disease-associated secreted proteins in a diverse range of Gram-negative bacteria. An online web server of DeepSecE is also publicly available to predict and explore various secreted substrate proteins via the input of bacterial genome sequences.
Keywords: DEEP; SERVER; COMPETITIVE
Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]