Affiliation: [1] Liaoning Normal University, Dalian, Liaoning
Source: Open Journal of Nature Science (《自然科学》), 2021, No. 2, pp. 281-290 (10 pages)
Abstract: Molecular sequence comparison is the most basic and important problem in bioinformatics, and DNA sequence similarity analysis is an important research topic within it. Alignment-free methods are one approach to sequence comparison; they overcome the limitations of alignment-based methods and are faster to compute. This paper proposes an alignment-free method for sequence analysis that combines the positions of prefix identifiers with information entropy. A prefix tree is built for each biological sequence to obtain its prefix identifiers. For each pair of sequences, the prefix identifiers common to both are taken as the objects of study, their positions in each sequence are extracted, and the absolute value of each position difference is treated as a random variable. Information entropy is then used to define a new DNA sequence similarity measure and to establish an effective model. The mitochondrial DNA sequences of 70 mammals were used as the experimental data set, and the similarity distances produced by the model were used to construct a phylogenetic tree. The classification given by this tree is consistent with current biological taxonomy.
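The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions, because the abstract does not define them. Here a "prefix identifier" is assumed to be the shortest substring starting at a given position that occurs exactly once in the sequence. The function names `prefix_identifiers` and `position_difference_entropy` and the cap `max_len` are hypothetical. A substring counter stands in for an explicit prefix tree. The sketch stops at the Shannon entropy of the position differences, since the abstract does not say how that entropy is converted into a distance.

```python
import math
from collections import Counter


def prefix_identifiers(seq, max_len=12):
    """Map each assumed prefix identifier to its start position in seq.

    Assumption: the identifier for position i is the shortest substring
    starting at i that occurs exactly once in seq (length capped at max_len).
    """
    # Count every substring of length 1..max_len (stand-in for a prefix tree).
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq))
        for k in range(1, max_len + 1)
        if i + k <= len(seq)
    )
    identifiers = {}
    for i in range(len(seq)):
        for k in range(1, max_len + 1):
            if i + k > len(seq):
                break
            sub = seq[i:i + k]
            if counts[sub] == 1:  # unique in seq, so its position is unambiguous
                identifiers[sub] = i
                break
    return identifiers


def position_difference_entropy(seq_a, seq_b, max_len=12):
    """Shannon entropy (in bits) of |pos_a - pos_b| over shared identifiers."""
    ids_a = prefix_identifiers(seq_a, max_len)
    ids_b = prefix_identifiers(seq_b, max_len)
    common = ids_a.keys() & ids_b.keys()
    if not common:
        return 0.0
    # Empirical distribution of the absolute position differences.
    diffs = Counter(abs(ids_a[s] - ids_b[s]) for s in common)
    total = sum(diffs.values())
    return sum((c / total) * math.log2(total / c) for c in diffs.values())


if __name__ == "__main__":
    print(position_difference_entropy("ACGT", "ACGT"))
    print(position_difference_entropy("ACGT", "TACG"))
```

Under these assumptions, two identical sequences yield an entropy of zero, because every shared identifier sits at the same position in both. Pairwise values computed this way could then be fed into a distance-based tree-building method, a step the abstract mentions but does not specify.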