Research on Optimization Improvement and Spark Parallelization of the Smith-Waterman Algorithm

Cited by: 2


Authors: LI Leixiao; LIU Yanfeng; GAO Jing [3,4]

Affiliations: [1] College of Data Science and Application, Inner Mongolia University of Technology, Hohhot 010080, China; [2] Inner Mongolia Autonomous Region Engineering & Technology Research Center of Big Data Based Software Service, Hohhot 010080, China; [3] College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot 010018, China; [4] Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application for Agriculture and Animal Husbandry, Hohhot 010018, China

Source: Journal of Inner Mongolia Agricultural University (Natural Science Edition), 2019, Issue 5, pp. 76-85 (10 pages)

Funding: National Natural Science Foundation of China (61462070); Inner Mongolia Agricultural University Doctoral Research Fund (BJ09-44)

Abstract: Smith-Waterman is a biological sequence alignment algorithm that offers the highest accuracy and is also widely used in text search. Building on an in-depth study of the Smith-Waterman algorithm, this paper optimizes and improves it in two respects: reducing the number of computing tasks and lowering the computational complexity. The improved algorithm is then parallelized on the Spark platform. The performance of the improved algorithm and of its parallel version is analyzed through an accuracy test, a running-speed test, a speed-comparison test, and a scalability test. The experimental results show that the optimized and parallelized algorithm greatly improves running speed and scalability while preserving accuracy.
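For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of the textbook Smith-Waterman local alignment. It is not the authors' optimized or Spark-parallelized version, which this record does not describe. The function name `smith_waterman`, the linear gap penalty, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are illustrative assumptions.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Textbook Smith-Waterman with a linear gap penalty (assumed scoring).

    Returns (best_score, aligned_a, aligned_b) for one best local alignment.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment
                H[i - 1][j - 1] + sub,  # match or mismatch
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

The nested loop fills a matrix of size (len(a)+1) by (len(b)+1), so time and memory both grow with the product of the sequence lengths. That quadratic cost is presumably what motivates the reduction in computing tasks and the parallelization mentioned in the abstract.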

Keywords: gene sequence alignment; Smith-Waterman algorithm; optimization and improvement; Spark parallelization

Classification code: TP311.13 [Automation and Computer Technology: Computer Software and Theory]

 
