APIR: Aggregating Universal Proteomics Database Search Algorithms for Peptide Identification with FDR Control


Authors: Yiling Elaine Chen, Xinzhou Ge, Kyla Woyshner, MeiLu McDermott, Antigoni Manousopoulou, Scott B. Ficarro, Jarrod A. Marto, Kexin Li, Leo David Wang, Jingyi Jessica Li

Affiliations: [1] Department of Statistics and Data Science, University of California, Los Angeles, CA 90095, USA; [2] Department of Immuno-Oncology, Beckman Research Institute, City of Hope National Medical Center, Duarte, CA 91010, USA; [3] Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA; [4] Department of Cancer Biology and Blais Proteomics Center, Dana-Farber Cancer Institute, Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02215, USA; [5] Department of Pediatrics, City of Hope National Medical Center, Duarte, CA 91010, USA; [6] Bioinformatics Interdepartmental Program, University of California, Los Angeles, CA 90095, USA; [7] Department of Human Genetics, University of California, Los Angeles, CA 90095, USA; [8] Department of Computational Medicine, University of California, Los Angeles, CA 90095, USA; [9] Department of Biostatistics, University of California, Los Angeles, CA 90095, USA

Source: Genomics, Proteomics & Bioinformatics (基因组蛋白质组与生物信息学报, English edition), 2024, Issue 2, pp. 171-187 (17 pages)

Funding: Supported by the following grants: the National Cancer Institute, USA (a part of the National Institutes of Health, USA; Grant No. T32LM012424) to Yiling Elaine Chen; the National Cancer Institute, USA (Grant No. K08CA201591); the Margaret E Early Medical Research Trust, USA; the Pediatric Cancer Research Foundation, USA to Leo David Wang; the National Cancer Institute under Cancer Center Support Grant, USA (Grant No. P30CA033572) to the MS facility at the City of Hope; the National Institute of General Medical Sciences, USA (a part of the National Institutes of Health, USA; Grant Nos. R01GM120507 and R35GM140888); the National Science Foundation, USA (Grant Nos. DBI-1846216 and DMS-2113754); the Johnson & Johnson WiSTEM2D Award, USA, the Sloan Research Fellowship, USA; the UCLA David Geffen School of Medicine W.M. Keck Foundation Junior Faculty Award, USA, to Jingyi Jessica Li.

Abstract: Advances in mass spectrometry (MS) have enabled high-throughput analysis of proteomes in biological systems. The state-of-the-art MS data analysis relies on database search algorithms to quantify proteins by identifying peptide–spectrum matches (PSMs), which convert mass spectra to peptide sequences. Different database search algorithms use distinct search strategies and thus may identify unique PSMs. However, no existing approaches can aggregate all user-specified database search algorithms with a guaranteed increase in the number of identified peptides and a control on the false discovery rate (FDR). To fill in this gap, we proposed a statistical framework, Aggregation of Peptide Identification Results (APIR), that is universally compatible with all database search algorithms. Notably, under an FDR threshold, APIR is guaranteed to identify at least as many, if not more, peptides as individual database search algorithms do. Evaluation of APIR on a complex proteomics standard dataset showed that APIR outpowers individual database search algorithms and empirically controls the FDR. Real data studies showed that APIR can identify disease-related proteins and post-translational modifications missed by some individual database search algorithms. The APIR framework is easily extendable to aggregating discoveries made by multiple algorithms in other high-throughput biomedical data analysis, e.g., differential gene expression analysis on RNA sequencing data. The APIR R package is available at https://github.com/yiling0210/APIR.

Keywords: Shotgun proteomics; Peptide–spectrum match; Peptide identification; Aggregation of lists; FDR control

Classification code: Q78 [Biology: Molecular Biology]

 
