Authors: Wang Haipeng (王海鹏)[1], Fu Yan (付岩)[1], Sun Ruixiang (孙瑞祥)[1], He Simin (贺思敏)[1], Zeng Rong (曾嵘)[2], Gao Wen (高文)[1]
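Affiliations: [1] Digital Technology Research Laboratory, Institute of Computing Technology, Chinese Academy of Sciences; [2] Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031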
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2005, No. 9, pp. 1511-1518 (8 pages)
Funding: National Key Basic Research Development Program of China ("973" Program) (2002CB713807); National Key Technologies R&D Program of China (2004BA711A21)
Abstract: Peptide and protein identification by mass spectrometry is a key problem in proteomics. This paper presents pepReap, a peptide identification algorithm based on support vector machines (SVM) that uses a two-layer scoring scheme. In the first layer, coarse scoring uses the total intensity and number of matched peaks, together with peptide length, to produce a list of candidate peptide sequences. In the second layer, fine scoring uses an SVM to combine several match indicators, such as correlations between ions, ion match error, and peptide sequence information, to evaluate the coarse-scoring results and yield more reliable peptide identifications. During SVM parameter selection, the Matthews correlation coefficient is used to measure classification performance so as to accommodate unbalanced datasets. Experiments on a published dataset of tandem mass spectra show that, compared with the popular commercial software SEQUEST, which uses threshold-based evaluation, pepReap achieves higher identification sensitivity at comparable precision.
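As a brief illustration of why the Matthews correlation coefficient suits unbalanced datasets, the sketch below computes it from confusion-matrix counts using the standard formula. It is not taken from pepReap: the function name `mcc_from_counts`, the counts, and the convention of returning 0.0 when the denominator is zero are all assumptions made for this example.

```python
import math


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Hypothetical helper for illustration only; not part of pepReap.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when any marginal total is zero,
    # since the coefficient is undefined in that case.
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Made-up unbalanced example: 95 positives, 5 negatives, and a classifier
# that labels every sample positive.
tp, tn, fp, fn = 95, 0, 5, 0
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)                         # 0.95
print(mcc_from_counts(tp, tn, fp, fn))  # 0.0
```

In this made-up case accuracy looks high even though the classifier has learned nothing about the minority class, while the coefficient is 0.0. That is the kind of behavior that makes it a reasonable selection criterion when one class dominates.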