Authors: LI Xiaopeng (李晓鹏); LING Cheng (凌诚); GAO Jingyang (高敬阳) [1]
Affiliations: [1] School of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100000, China; [2] China Overseas International Center, Advanced Micro Devices, Inc. (AMD), Beijing 100000, China
Source: Computer Science (《计算机科学》), 2023, No. 12, pp. 322-329 (8 pages)
Funding: Beijing Natural Science Foundation (5182018).
Abstract: As modern molecular sequence data become increasingly abundant, the space of tree topologies describing the historical relationships between species expands dramatically, and reliable inference of phylogenetic trees still faces enormous challenges. In recent years, the Hamiltonian Markov Monte Carlo (HMC) algorithm, the most advanced member of the Markov chain Monte Carlo (MCMC) family, has been shown to be applicable to phylogenetic analysis; it avoids much of the random-walk behavior of traditional MCMC algorithms and speeds up the mixing of the Markov chain. In the more complex multimodal phylogenetic tree space, however, HMC cannot escape a local high-probability region because it does not obtain proposals from other modes. To improve the robustness of the algorithm, this paper proposes a mixed-path Hamiltonian Markov Monte Carlo (MPHMC) optimization method. Without adding computational cost, the proposed algorithm inserts a non-HMC update component for the discrete parameters into the sampling path and alternates it with the deterministic HMC updates. This introduces a branch rearrangement strategy with larger topological changes into the tree space, so the sampler can traverse the tree space of the whole posterior distribution more freely. Experiments on five empirical datasets show that MPHMC samples better from the correct posterior distribution. On large datasets that are harder to sample, the single-path HMC sampler may fail, whereas MPHMC achieves a sampling efficiency more than 14% higher than that of the widely used phylogenetic analysis tool MrBayes (MCMC).
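The alternation described in the abstract, a gradient-based HMC update for the continuous parameters interleaved with a non-HMC update for a discrete parameter, can be illustrated with a toy sketch. This is not the authors' MPHMC implementation: the two-state variable `k` merely stands in for a tree topology, the scalar `x` stands in for continuous parameters such as branch lengths, and `MODES`, `hmc_step`, `discrete_step`, and `sample` are hypothetical names chosen for illustration.

```python
import math
import random

# Hypothetical toy target: p(x, k) is proportional to exp(-(x - MODES[k])^2 / 2),
# with k in {0, 1}. k plays the role of a discrete topology, x of a continuous parameter.
MODES = (-1.5, 1.5)


def log_density(x, k):
    """Unnormalized log density of x given discrete state k."""
    return -0.5 * (x - MODES[k]) ** 2


def grad_potential(x, k):
    """Gradient of the potential energy U(x) = -log_density(x, k)."""
    return x - MODES[k]


def hmc_step(x, k, rng, step_size=0.2, n_leapfrog=10):
    """One HMC update of x with k held fixed (leapfrog plus Metropolis test)."""
    p0 = rng.gauss(0.0, 1.0)
    x_new, p = x, p0
    p -= 0.5 * step_size * grad_potential(x_new, k)   # initial half step
    for i in range(n_leapfrog):
        x_new += step_size * p                         # full position step
        if i < n_leapfrog - 1:
            p -= step_size * grad_potential(x_new, k)  # full momentum step
    p -= 0.5 * step_size * grad_potential(x_new, k)   # final half step
    h_old = -log_density(x, k) + 0.5 * p0 ** 2
    h_new = -log_density(x_new, k) + 0.5 * p ** 2
    if rng.random() < math.exp(min(0.0, h_old - h_new)):
        return x_new
    return x


def discrete_step(x, k, rng):
    """One non-HMC Metropolis update of k with x held fixed (propose a flip)."""
    k_new = 1 - k
    log_ratio = log_density(x, k_new) - log_density(x, k)
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return k_new
    return k


def sample(n_iter, seed=0):
    """Alternate the HMC update and the discrete update for n_iter iterations."""
    rng = random.Random(seed)
    x, k = MODES[0], 0
    draws = []
    for _ in range(n_iter):
        x = hmc_step(x, k, rng)
        k = discrete_step(x, k, rng)
        draws.append((x, k))
    return draws
```

Calling `sample(5000)` returns a list of `(x, k)` pairs. Because the toy target is symmetric, roughly half of the draws should fall in each discrete state, which is only possible because the discrete update lets the chain move between the two modes; the HMC step alone never changes `k`.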
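Keywords: MrBayes; tree space; Hamiltonian Markov Monte Carlo (HMC); multimodal posterior distribution; mixed path
Classification No.: TP399 [Automation and Computer Technology: Computer Application Technology]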