检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
机构地区:[1]四川大学计算机学院,成都610065 [2]四川大学锦江学院计算机学院,眉山620860
出 处:《四川大学学报(自然科学版)》2017年第2期280-286,共7页Journal of Sichuan University(Natural Science Edition)
基 金:四川省科技厅支撑项目(2012GZ0091;2013GZX0138);四川大学青年教师科研启动基金(2015SCU11050)
摘 要:后缀树和后缀数组广泛用于生物信息学领域中,特别是通过启发式算法在对DNA基因片段进行匹配的阶段.本文提出了在GPU的平台下,利用多核和超多核体系构成的后缀树以及后缀数组并行匹配大规模基因片段,从而加速基因搜索匹配过程.相对于后缀树,后缀数组二分搜素算法具有内存占用少,缓存使用率高等优点.在GPU的性能评估中,后缀数组执行效率明显超过后缀树,后缀数组占用的空间仅为后缀树的20%~30%.相对于CPU的串行实现,后缀树组达到了约99倍的加速比.实验结果表明在基因片段匹配的过程中,基于GPU的后缀数组二分搜索是一种高效且实用的方法.Suffix trees and suffix arrays have been used widely in bioinformatics applications, especially for DNA sequence alignments in the initial exact match phase of heuristic algorithms. In this paper, a new GPU implementation and optimization of the suffix tree and suffix array on both multi-core and many-core platforms to accelerate multiple genome sequence searching is presented. The comparative performance evaluation between the suffix tree and suffix array is then carried out. The results showed that the suffix array needed only 20%-30% of memory space compared with the suffix tree, and that the mean search time of the suffix array was significantly shorter than the mean search time of the suffix tree because of the use of a binary search with coalesced memory access and tile optimization under the GPU architecture. Moreover, the GPU implementation of the suffix array gained a speedup of approxi mate 99 times compared with the corresponding CPU serial implementation. This study showed that the massively parallel sequence matching algorithm based on suffix array was an efficient approach with the high-performance in the process of multiple DNA sequence matching.
分 类 号:TP316[自动化与计算机技术—计算机软件与理论]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:18.216.230.65