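Affiliation: [1] College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, China
Source: Journal of Beijing University of Technology (《北京工业大学学报》), 2011, No. 7, pp. 1103-1109 (7 pages)
Funding: National Natural Science Foundation of China (30570427); Beijing Natural Science Foundation (4092008)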
Abstract: Based on the SCOP classification, 70 protein folds were selected for study. For each fold, representative proteins with less than 25% sequence identity were chosen as the training set, and only folds with more than 3 such samples were used. The training sequences were aligned by structural alignment, combining an automated tool with manual inspection, and the resulting sequence alignment was used to train a profile hidden Markov model (profile HMM) for recognizing that fold. In a single-model recognition test on the 9,505 protein domain samples in Astral 1.65, the average sensitivity and specificity were 91.93% and 99.95%, respectively, and the Matthews correlation coefficient was 0.87. At the fold level, compared with Pfam and SUPERFAMILY, which build HMMs from sequence alignment alone, the number of models is significantly reduced while recognition performance remains high. The results show that, for proteins that share a fold but have low sequence similarity, a unified HMM can be built by introducing structural alignment, enabling high-accuracy fold recognition.
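The three evaluation measures quoted in the abstract can be computed from a binary confusion matrix for one fold model. The following is a minimal sketch using the standard textbook definitions, not the authors' evaluation code; the function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) for one fold model.

    tp/fn: domains of the fold that the model accepts/rejects.
    tn/fp: domains of other folds that the model rejects/accepts.
    Assumes at least one positive and one negative sample.
    """
    sensitivity = tp / (tp + fn)  # fraction of true members recovered
    specificity = tn / (tn + fp)  # fraction of non-members rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts for illustration only (not data from the paper).
sens, spec, mcc = binary_metrics(tp=90, fp=5, tn=9400, fn=10)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} MCC={mcc:.4f}")
```

Because non-members of any single fold vastly outnumber its members, specificity stays close to 1 even when a model produces several false positives, which is why a balanced measure such as the Matthews correlation coefficient is usually reported alongside it.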