Affiliation: [1] School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
Source: Chinese Journal of Engineering Mathematics (《工程数学学报》), 2003, No. 3, pp. 117-124 (8 pages)
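Funding: National Natural Science Foundation of China (10271061); Tianjin University-Nankai University Joint Research Project; Liu Hui Center for Applied Mathematics.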
Abstract: The problem of predicting protein secondary structure was first posed in 1957, more than 40 years ago. From the available literature, the authors draw the following observations. In a statistical sense, the interactions among amino acids in a protein sequence are weak, so the assumption used by informational and statistical methods, that the amino acid sequence is an independent and identically distributed process, is reasonable and mathematically convenient even though it is not derived from the physical or biological background. The mutual information criterion is better than the mean square error criterion. The role of informational and statistical ideas and methods in secondary structure prediction should not be underestimated. Adding information beyond the primary structure helps improve the accuracy of secondary structure prediction, whereas existing methods that predict secondary structure directly from the primary structure, with no additional information, have achieved no major breakthrough in accuracy. Prediction accuracy and coverage, that is, the ratio of the number of protein sequences in the sample set to the total number of proteins in the PDB database, are two important indices for evaluating the effectiveness of a prediction method. The authors establish a joint structure model of protein primary and secondary structure, a two-dimensional stochastic sequence consisting of the primary sequence and the sequence of secondary structure states, which incorporates all of the observations above. From this model they first obtain the informational and statistical characteristics of the primary and secondary structure, and then use these characteristics to analyze quantitatively the information transfer among the variables of the primary and secondary structure and to give an exact statistical description of the hidden Markov property. Finally, they give several principles for predicting secondary structure directly from the primary structure.
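Keywords: joint structure model of protein primary and secondary structure; tripeptide chain; secondary structure prediction accuracy and coverage; hidden Markov property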
Classification code: O236 [Science: Operations Research and Cybernetics]