Authors: Yunlong Liu, Morteza H. Ghaffari, Tao Ma, Yan Tu
Affiliations: [1] Key Laboratory of Feed Biotechnology of the Ministry of Agriculture and Rural Affairs, Institute of Feed Research, Chinese Academy of Agricultural Sciences, Beijing 100081, China; [2] Institute of Animal Science, Physiology Unit, University of Bonn, Bonn 53115, Germany
Source: aBIOTECH, 2024, Issue 4, pp. 465-475 (11 pages). Chinese journal title: 生物技术通报(英文版)
Funding: Supported by the Central Public-Interest Scientific Institution Basal Research Fund of the Chinese Academy of Agricultural Sciences (Y2022QC10) and the Agricultural Sciences and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-IFRZDRW202404, CAAS-ASTIP-2023-IFR-04).
Abstract: Accurate taxonomic classification is essential to understanding microbial diversity and function through metagenomic sequencing. However, this task is complicated by the vast variety of microbial genomes and the computational limitations of bioinformatics tools. The aim of this study was to evaluate the impact of reference database selection and confidence score (CS) settings on the performance of Kraken2, a widely used k-mer-based metagenomic classifier. We generated simulated metagenomic datasets to systematically evaluate how the choice of reference database, from the compact Minikraken v1 to the expansive nt and GTDB r202, and different CS values (from 0 to 1.0) affect the key performance metrics of Kraken2. These metrics include classification rate, precision, recall, F1 score, and the accuracy of calculated versus true bacterial abundance estimates. Our results show that a higher CS, which increases the rigor of taxonomic classification by requiring greater k-mer agreement, generally decreases the classification rate. This effect is particularly pronounced for smaller databases such as Minikraken and Standard-16, where no reads could be classified when the CS was above 0.4. In contrast, for larger databases such as Standard, nt, and GTDB r202, precision and F1 scores improved significantly with increasing CS, highlighting their robustness to stringent conditions. Recovery rates were mostly stable, indicating consistent detection of species under different CS settings. Crucially, the results show that a comprehensive reference database combined with a moderate CS (0.2 or 0.4) significantly improves classification accuracy and sensitivity. This finding underscores the need to select database and CS parameters carefully, tailored to the specific scientific question and the available computational resources, to optimize the results of metagenomic analyses.
Keywords: Metagenome; Taxonomic classification; Kraken2; Reference database; Confidence score
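Note: The metrics named in the abstract (classification rate, precision, recall, F1 score) can be illustrated with a minimal sketch. The read-level definitions below are an assumption made for illustration and may differ from the ones the authors used; the function name `classification_metrics` and the toy labels are hypothetical. An unclassified read is represented as `None`.

```python
def classification_metrics(true_taxa, predicted_taxa):
    """Read-level metrics for a simulated dataset with known labels.

    Assumed definitions (not taken from the paper):
      classification rate = classified reads / all reads
      precision           = correctly classified reads / classified reads
      recall              = correctly classified reads / all reads
      F1                  = harmonic mean of precision and recall
    """
    if len(true_taxa) != len(predicted_taxa):
        raise ValueError("label lists must have the same length")

    total = len(true_taxa)
    classified = sum(p is not None for p in predicted_taxa)
    correct = sum(
        p is not None and p == t for t, p in zip(true_taxa, predicted_taxa)
    )

    rate = classified / total if total else 0.0
    # If nothing is classified, precision is reported as 0.0 rather than
    # raising a division error.
    precision = correct / classified if classified else 0.0
    recall = correct / total if total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "classification_rate": rate,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy example: five simulated reads, one left unclassified, one misassigned.
truth = ["A", "A", "B", "B", "C"]
calls = ["A", "B", None, "B", "C"]
metrics = classification_metrics(truth, calls)
```

Under these assumed definitions, raising the CS can only turn classified reads into unclassified ones for a fixed database, which is consistent with the trade-off the abstract describes: the classification rate drops while precision can rise if the reads that lose their assignment were mostly wrong.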