Authors: Guo Yihao (郭益浩); Li Jing (李婧) [1]
Affiliation: [1] School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240
Source: Genomics and Applied Biology (《基因组学与应用生物学》), 2021, No. 2, pp. 909-915 (7 pages)
Funding: Jointly supported by the National Natural Science Foundation of China (31871329) and the Natural Science Foundation of Shanghai (17ZR1413900).
Abstract: Quantitative protein analysis based on mass spectrometry is an important research approach in high-throughput proteomics. However, owing to the limitations of current mass spectrometry techniques, large-scale protein quantification often produces many missing values, which to some extent reduces the accuracy of downstream analysis. Although many imputation methods have been proposed, the proteomics field still lacks a comprehensive evaluation of how effective they are under different conditions. Here, we constructed simulated datasets based on the distributional characteristics of real data and, along three dimensions (sample size, effect size, and missing ratio), comprehensively evaluated the effectiveness of eight missing-value imputation methods: kNN, SVD, MLE, BPCA, LLS, Min, QRILC, and Mean. The results show that imputation effectiveness is positively correlated with sample size and effect size and negatively correlated with the missing ratio. The effectiveness of the methods also differs across datasets, so researchers need to choose a suitable method according to the characteristics of their dataset and their own needs. This study summarizes the optimal imputation methods for different dataset characteristics as a reference for researchers.
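The evaluation design described in the abstract (simulate complete data, remove values, impute, compare with the truth) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the function names, the Gaussian log-intensity model, the completely random missingness, and the use of NRMSE as the score are all hypothetical choices. Only the two simplest of the eight methods (Mean and Min) are shown, applied per protein.

```python
import math
import random
import statistics


def simulate(n_proteins, n_samples, effect, seed=0):
    """Simulate a proteins x samples matrix of log intensities for two groups.

    The second half of the samples is shifted by `effect` (assumed model).
    """
    rng = random.Random(seed)
    half = n_samples // 2
    rows = []
    for _ in range(n_proteins):
        base = rng.gauss(20.0, 2.0)  # protein-specific abundance
        rows.append([
            base + rng.gauss(0.0, 0.5) + (effect if j >= half else 0.0)
            for j in range(n_samples)
        ])
    return rows


def mask_random(data, missing_ratio, seed=1):
    """Replace about `missing_ratio` of the entries with None, at random.

    At least one observed value is kept per protein so that row-wise
    imputation is always possible.
    """
    rng = random.Random(seed)
    masked = []
    for row in data:
        new_row = [None if rng.random() < missing_ratio else v for v in row]
        if all(v is None for v in new_row):
            keep = rng.randrange(len(row))
            new_row[keep] = row[keep]
        masked.append(new_row)
    return masked


def _fill(masked, summary):
    """Fill each protein's missing entries with summary(observed values)."""
    filled = []
    for row in masked:
        fill_value = summary([v for v in row if v is not None])
        filled.append([fill_value if v is None else v for v in row])
    return filled


def impute_mean(masked):
    """Mean imputation: per-protein mean of the observed values."""
    return _fill(masked, statistics.fmean)


def impute_min(masked):
    """Min imputation: per-protein minimum of the observed values."""
    return _fill(masked, min)


def nrmse(truth, masked, imputed):
    """RMSE over the masked entries, divided by the SD of the true data."""
    sq_errors = [
        (imputed[i][j] - truth[i][j]) ** 2
        for i, row in enumerate(masked)
        for j, v in enumerate(row)
        if v is None
    ]
    if not sq_errors:
        return 0.0  # nothing was missing, so there is no imputation error
    spread = statistics.pstdev(v for row in truth for v in row)
    return math.sqrt(statistics.fmean(sq_errors)) / spread


if __name__ == "__main__":
    # Vary two of the three dimensions named in the abstract.
    for n_samples in (6, 20):
        for missing_ratio in (0.1, 0.3):
            truth = simulate(200, n_samples, effect=1.0)
            masked = mask_random(truth, missing_ratio)
            for name, method in (("Mean", impute_mean), ("Min", impute_min)):
                score = nrmse(truth, masked, method(masked))
                print(f"n={n_samples} missing={missing_ratio} {name}: {score:.3f}")
```

Note that this sketch removes values completely at random. Min and QRILC are typically intended for left-censored data, where low intensities are more likely to be missing, so a fair comparison would also need an abundance-dependent masking step.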