Source: Chinese Journal of Biotechnology (《生物工程学报》), 2009, No. 10, pp. 1508-1515 (8 pages)
Funding: National Basic Research Program of China (973 Program) (No. 2007CB707804); National Natural Science Foundation of China (No. 20806031)
Abstract: In this work, we systematically analyzed the secondary structure amino acid compositions of acidic and alkaline enzymes and compared them with those of neutral enzymes. Acidic and alkaline enzymes show different amino acid preferences when forming particular secondary structures. In both groups, neutral residues and tiny residues (those with very small side chains) are consistently more abundant, which may be a general mechanism for their adaptation to pH extremes. Based on these observations, we present a secondary structure amino acid composition method for extracting features from protein sequences. The overall prediction accuracy, evaluated by 10-fold cross-validation, reached 80.3%. Compared with other common feature extraction methods, the overall prediction accuracy improved by 9.4% to 18.7%. The random forests algorithm also outperformed other machine learning techniques, with improvements ranging from 2.7% to 21.8%.
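The abstract does not spell out how the secondary structure amino acid composition features are computed. As a rough illustration only, the sketch below counts each amino acid separately within each secondary-structure state and normalizes by sequence length, giving 60 values per protein. The three-state labels (H, E, C), the normalization, the feature ordering, and the function name `ss_composition` are all assumptions made for this example, not details taken from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
SS_STATES = "HEC"  # assumed labels: helix, strand, coil


def ss_composition(sequence, structure):
    """Return 60 features: the fraction of residues that are amino acid `a`
    in secondary-structure state `s`, ordered by state, then amino acid.

    Hypothetical helper; the paper's exact definition may differ.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    n = len(sequence)
    # Count (state, residue) pairs position by position.
    counts = Counter(zip(structure, sequence))
    return [
        counts[(s, a)] / n if n else 0.0
        for s in SS_STATES
        for a in AMINO_ACIDS
    ]


# Toy input: a 4-residue sequence with one state label per residue.
features = ss_composition("ACDA", "HHEC")
print(len(features))
```

A vector of this kind could then be passed to any classifier, such as a random forest, and scored with 10-fold cross-validation as the abstract describes.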