Algorithm Analysis and Efficient Parallelization of the Single Particle Reconstruction Software Package: EMAN

Authors: Fan Liya (樊莉亚) [1,2], Zhang Fa (张法) [1], Wang Gongming (王功明) [1,2], Liu Zhiyong (刘志勇) [1]

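Affiliations: [1] Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; [2] Graduate University of Chinese Academy of Sciences, Beijing 100049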

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2010, No. 12, pp. 2165-2176 (12 pages)

Funding: National Natural Science Foundation of China (90612019; 60752001; 60736012; 60503060); Knowledge Innovation Program of the Chinese Academy of Sciences (KGGX1-YW-13)

Abstract: Single particle reconstruction is one of the most important techniques for determining the three-dimensional structures of macromolecules, and its distinctive advantages have drawn growing attention in recent years. Its application is nevertheless severely constrained by extremely long processing times and the lack of efficient parallel implementations. This study optimizes and parallelizes EMAN, one of the most widely used software packages for single particle reconstruction. By analyzing the algorithms of its major components, the authors find that the key problem is achieving load balance while keeping communication overhead low. To solve this problem, they propose a self-adaptive dynamic scheduling algorithm, which applies not only to EMAN but also to other, similar scheduling problems with independent tasks. Experiments show that the optimized serial program runs in 11.50% less time than EMAN. With the self-adaptive dynamic scheduling algorithm, the parallel implementation achieves higher speedups than the parallel implementation shipped with EMAN, and the speedup of classification, the most time-consuming component, is close to linear. On 16 processor cores, overall parallel efficiency is 29.8% higher than that of EMAN's own parallel implementation. The implementation therefore makes effective use of available computing resources and significantly shortens the time required for single particle reconstruction.

Keywords: bioinformatics; parallel computing; scheduling algorithm; EMAN; single particle reconstruction

Classification number: TP311.56 [Automation and Computer Technology: Computer Software and Theory]

 
