基于块排序索引的生物序列局部比对查询技术  

Block Sorting Index-based Techniques for Local Alignment Searches on Biological Sequences

在线阅读下载全文

作  者:李永光[1] 王镝[1] 王国仁[1] 马宜菲[1] 

机构地区:[1]东北大学信息科学与工程学院计算机系统研究所,沈阳110004

出  处:《计算机科学》2005年第12期159-163,205,共6页Computer Science

基  金:教育部高等学校优秀青年教师学科研奖励计划基金资助项目;国家自然科学基金(60273079;60473074)资助

摘  要:生物数据库中的查询是在生物序列数据集中查找与输入查询序列相似的目标,目前的一些流行工具如BLAST等,是利用启发式算法来提高查询的速度。然而,这些启发式算法无法找到所有的满足要求的结果,而一些精确算法,如动态规划算法,却需要非常高昂的代价。最近,一种新的技术,QASIS,提出了在后缀树的遍历中使用动态规划的精确查找算法,其性能与BLAST相当。但是它的主要缺点就是后缀树这种索引结构需要巨大的空间开销。本文采用基于无损压缩的块排序结构来索引超常的生物序列,减小索引的存储空间开销,有效地减少动态规划算法的计算代价。实验结果表明基于块排序索引的算法在性能方面优于OASIS算法。A common query against large protein and input query sequence. The current set popular search gene sequence data sets is to locate targets that are similar to an tools, such as BLAST, employ heuristics to improve the speed of such searches. However, such heuristics can sometimes miss targets, which in many cases is undesirable. The alternative to BLAST is to use an accurate algorithm, Such as Snfith-Waterman(S-W) algorithm. However, these accurate algorithms are computationally very expensive. Recently, a new technique, OASIS, has been proposed to improve the efficiency and accuracy by employing dynamical programming during traversing suffix tree and its speed is comparable to BLAST. But its main drawback is too much memory consuming. We propose an efficient and accurate algorithm for locally aligning genome sequences. We construct a block sorting index structure for the large sequence. The index structure is less than the suffix tree index and can be fit for large data size. Experimental results show that our algorithm has better performance than OASIS.

关 键 词:生物序列 块排序索引 精确度 

分 类 号:TP311.13[自动化与计算机技术—计算机软件与理论] TP311.12[自动化与计算机技术—计算机科学与技术]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象