An Algorithm for Identifying DNA Repeat Sequences Based on Frequent Subtree Mining (cited by: 2)

Source: Microelectronics & Computer (《微电子学与计算机》), 2011, No. 9, pp. 193-196, 201 (5 pages)

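Funding: National Natural Science Foundation of China (30671639); Natural Science Foundation of Jiangsu Province (BK2009393); Jiangsu Province Qing Lan Project for Academic Leaders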

Abstract: This paper proposes a method for identifying DNA repeat sequences based on a frequent subtree mining strategy. Instead of relying on traditional sequence alignment, the method organizes the sequence as a suffix tree, reduces the suffix tree into a form better suited to subtree mining, and then applies frequent subtree mining to the result. The algorithm directly identifies repeat sequences that satisfy a set threshold, which avoids the time otherwise spent splicing short repeats together. A "secondary identification" technique also lets the algorithm identify fuzzy repeats well, improving the completeness of identification. Experiments show that the method improves time efficiency compared with mainstream algorithms, with the advantage most pronounced for longer repeats, while remaining highly comparable in identification completeness.
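
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of finding repeats that meet a threshold through a suffix structure rather than through alignment. It uses a naive, depth-limited suffix trie, not the paper's reduced suffix tree, and it does not implement frequent subtree mining or the "secondary identification" step. The names `find_repeats`, `min_len`, `min_count`, and `max_depth` are hypothetical and do not come from the paper.

```python
class Node:
    """One node of a naive suffix trie (an illustrative stand-in, not the paper's tree)."""

    def __init__(self):
        self.children = {}
        self.count = 0  # number of suffixes passing through = occurrences of this substring


def build_suffix_trie(seq, max_depth):
    """Insert every suffix of seq, truncated to max_depth characters."""
    root = Node()
    for start in range(len(seq)):
        node = root
        for ch in seq[start:start + max_depth]:
            node = node.children.setdefault(ch, Node())
            node.count += 1
    return root


def find_repeats(seq, min_len, min_count, max_depth=32):
    """Return {repeat: occurrences} for substrings of length >= min_len that occur
    at least min_count times (overlaps counted) and cannot be extended to the
    right without dropping below min_count. Repeats longer than max_depth are
    reported truncated to max_depth."""
    root = build_suffix_trie(seq, max_depth)
    found = {}
    stack = [(root, "")]
    while stack:
        node, word = stack.pop()
        # Prune: a child below the threshold cannot lead to any frequent substring.
        frequent = [(ch, child) for ch, child in node.children.items()
                    if child.count >= min_count]
        if not frequent and len(word) >= min_len:
            found[word] = node.count
        for ch, child in frequent:
            stack.append((child, word + ch))
    # Drop repeats that are merely a suffix of a longer reported repeat with the same count.
    return {
        w: c for w, c in found.items()
        if not any(v != w and v.endswith(w) and found[v] == c for v in found)
    }


print(find_repeats("ACGTACGTTTACGT", min_len=3, min_count=3))  # {'ACGT': 3}
```

Because the threshold is applied while walking the trie, long repeats are reported directly instead of being assembled from short ones, which is the property the abstract emphasizes. The naive trie takes time and memory proportional to sequence length times `max_depth`, so it is suitable only for small illustrative inputs.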

Keywords: DNA sequence; repeat identification; frequent subtree mining

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
