Protein Fold Recognition Using Hidden Markov Model


Authors: Li Xiaoqin (李晓琴)[1], Ren Wenke (仁文科)[1], Liu Yue (刘岳)[1]

Affiliation: [1] College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, China

Source: Journal of Beijing University of Technology (《北京工业大学学报》), 2011, No. 7, pp. 1103-1109 (7 pages)

Funding: National Natural Science Foundation of China (30570427); Beijing Natural Science Foundation (4092008)

Abstract: Seventy protein folds, selected according to the SCOP classification, were studied. For each fold, representative proteins with sequence identity below 25% were chosen as the training set, and only folds with more than 3 such samples were included. The structures were aligned by combining an automated structure alignment tool with manual inspection, and the resulting sequence alignment was used to train a profile hidden Markov model (profile HMM) for recognizing that fold. In a single-model recognition test on 9,505 protein domain samples from ASTRAL 1.65, the average sensitivity and specificity were 91.93% and 99.95%, respectively, and the Matthews correlation coefficient was 0.87. At the fold level, compared with Pfam and SUPERFAMILY, which build their HMMs from sequence alignment alone, the number of models is significantly reduced while recognition performance remains high. The results show that, for proteins that share a fold type but have very low sequence similarity, a unified HMM can be built by introducing structure alignment, enabling fold recognition with high accuracy.
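The three evaluation measures named in the abstract (sensitivity, specificity, and the Matthews correlation coefficient) can all be derived from the confusion counts of a single fold model treated as a binary classifier. The following is a minimal sketch using the standard textbook definitions; the function name `binary_metrics` and the example counts are hypothetical illustrations, not data or code from the paper, and the paper's exact averaging procedure over the 70 folds is not specified here.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion counts.

    tp/fn: domains of the target fold that the model accepts/rejects.
    tn/fp: domains of other folds that the model rejects/accepts.
    Assumes at least one positive and one negative sample.
    """
    sensitivity = tp / (tp + fn)  # fraction of true fold members recovered
    specificity = tn / (tn + fp)  # fraction of non-members correctly rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts for one fold model (NOT taken from the paper);
# they sum to 9,505 only to mirror the size of the test set.
sens, spec, mcc = binary_metrics(tp=90, fp=5, tn=9400, fn=10)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} MCC={mcc:.4f}")
```

Because non-members vastly outnumber members for any single fold, specificity stays close to 1 even with a few false positives, which is why the Matthews correlation coefficient is a useful complementary measure on such imbalanced test sets.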

Keywords: protein; fold recognition; hidden Markov model; structure alignment

Classification number: O641 [Science: Physical Chemistry]

 
