Authors: 汪清[1,2], 顾乃杰[1,2], 何颂颂[1,2], 杨阳朝[1,2]
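Affiliations: [1] School of Computer Science, University of Science and Technology of China, Hefei 230027; [2] Anhui Province Key Laboratory of Computing and Communication Software, Hefei 230027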
Source: 《小型微型计算机系统》 (Journal of Chinese Computer Systems), 2014, No. 6, pp. 1207-1211 (5 pages)
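Funding: Supported by the National "核高基" (Core Electronic Devices, High-end Generic Chips and Basic Software) Major Project (2009ZX01028-002-003-005) and by the National Natural Science Foundation of China (60833004)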
Abstract: Targeting the SCC (Single-Chip Cloud Computer) architecture, this paper improves the efficiency of a parallel FFT implementation through three techniques: improved communication routing, preprocessing of message passing, and repartitioning of data processing. It then uses the improved implementation to study the scalability of the SCC. Experimental results show that, within a certain range of problem sizes, the improved FFT on the SCC achieves an average speedup of 4.10x on 2 cores (up to 4.78x), 6.01x on 4 cores (up to 6.77x), 10.46x on 8 cores (up to 11.53x), 16.20x on 16 cores (up to 18.51x), and 21.17x on 32 cores (up to 24.20x). As the problem size grows, inter-core communication bandwidth stabilizes and the 32-core speedup gradually increases, indicating that the SCC has good scalability.
Keywords: FFT; SCC; RCCE; parallelization; speedup; scalability
Classification: TP301 [Automation and Computer Technology - Computer System Architecture]