GPA: A Microbial Genetic Polymorphisms Assignments Tool in Metagenomic Analysis by Bayesian Estimation (Cited by: 1)

Authors: Jiarui Li, Pengcheng Du, Adam Yongxin Ye, Yuanyuan Zhang, Chuan Song, Hui Zeng, Chen Chen

Affiliations: [1] Beijing Key Laboratory of Emerging Infectious Diseases, Institute of Infectious Diseases, Beijing Ditan Hospital, Capital Medical University; [2] Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University

Source: Genomics, Proteomics & Bioinformatics (English edition), 2019, Issue 1, pp. 106-117 (12 pages)

Funding: Supported by the Beijing Municipal Science & Technology Commission (Grant No. Z161100000516021); the National Key R&D Program of China (Grant No. 2016YFC1200804); and the National Natural Science Foundation of China (Grant Nos. 81571956 and 81702038)

Abstract: Identifying antimicrobial-resistant (AMR) bacteria in metagenomic samples is essential for public health and food safety. Next-generation sequencing (NGS) technology provides a powerful tool for identifying genetic variation and establishing correlations between genotype and phenotype in humans and other species. However, for complex bacterial samples, a powerful bioinformatic tool for identifying genetic polymorphisms or copy number variations (CNVs) in given genes is lacking. Here we provide a Bayesian framework for genotype estimation in mixtures of multiple bacteria, named Genetic Polymorphisms Assignments (GPA). Simulation results showed that GPA reduced the false discovery rate (FDR) and mean absolute error (MAE) in CNV and single nucleotide variant (SNV) identification. The framework was validated with whole-genome sequencing and Pool-seq data from Klebsiella pneumoniae using multiple-bacteria mixture models, and it showed high accuracy in detecting the allele fractions of CNVs and SNVs in AMR genes between two populations. A quantitative study of the changes in AMR gene fractions between two samples showed good consistency with the AMR pattern observed in the individual strains. The framework, together with genome annotation and population comparison tools, has also been integrated into an application that could provide a complete solution for AMR gene identification and quantification in unculturable clinical samples. The GPA package is available at https://github.com/IID-DTH/GPA-package.

Keywords: Next-generation sequencing; Pool-seq; Bayesian model; Metagenomics; Genetic polymorphisms

Classification: Q [Biology]

 
