Multiple Sequence Alignment Method for Biological Genes Based on Spark Cloud Computing


Authors: YANG Bo [1]; CHEN Yangguang [2]; XU Shengchao [1]

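Affiliations: [1] School of Data Science, Guangzhou HuaShang College, Guangzhou 511300, China; [2] School of Accountancy, Guangzhou HuaShang College, Guangzhou 511300, China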

Source: Computer Measurement & Control, 2024, No. 7, pp. 274-279, 287 (7 pages)

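Funding: National Natural Science Foundation of China, General Program (61972444); Guangzhou HuaShang College On-campus Research Mentorship Project (2023HSDS34).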

Abstract: In multiple sequence alignment of biological genes, earlier methods compute only a single Spark cluster parameter, which results in poor parallel performance. To address this, a multiple sequence alignment method for biological genes based on Spark cloud computing is designed. The acquired biological genetic sequence data are optimized, and dynamic programming is applied to the multiple sequence alignment task by calculating the matching degree between different sequences. Spark cloud computing technology is used to build Spark clusters, and the parameters of multiple Spark clusters are calculated. The similarities and differences among the gene sequences are used to select the optimal matching path. On this basis, a parallel computing model for aligning multiple biological gene sequences is established and solved, yielding the corresponding parallel algorithm for multiple sequence alignment. Experimental results show that the method achieves better parallelism and effectively improves the performance of multiple sequence alignment.

Keywords: Spark cloud computing; biological genes; bioinformatics; gene multiple sequence alignment; parallel algorithm

Classification: TP393.4 [Automation and Computer Technology: Computer Application Technology]
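
Illustrative sketch (not from the paper): the abstract mentions computing a matching degree between sequences and applying dynamic programming, but gives no formulas or code. The example below is only a generic stand-in, assuming the classic Needleman-Wunsch global alignment score as the matching degree. The function names `nw_score` and `pairwise_scores` and the scoring values (`match=1`, `mismatch=-1`, `gap=-2`) are hypothetical choices made here for illustration.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of strings a and b (Needleman-Wunsch).

    Scoring values are illustrative assumptions, not taken from the paper.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    # Row 0: aligning the empty prefix of a against each prefix of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


def pairwise_scores(seqs):
    """Score every unordered pair of sequences; keys are index pairs (i, j), i < j."""
    return {
        (i, j): nw_score(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }


if __name__ == "__main__":
    genes = ["GATTACA", "GATCACA", "GATTAA"]
    for pair, score in sorted(pairwise_scores(genes).items()):
        print(pair, score)
```

Each pair score is independent of the others, so in a Spark setting the index pairs could plausibly be distributed across workers and scored in parallel. How the paper actually partitions the work and tunes cluster parameters is not stated in the abstract.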

 
