Access Optimization Technique for Mathematical Library of Slave Processors on Heterogeneous Many-core Architectures  Cited by: 6

Original title: 面向异构众核从核的数学函数库访存优化方法


Authors: Xu Jinchen[1], Guo Shaozhong[1], Huang Yongzhong[1], Wang Lei[1]

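Affiliation: [1] State Key Laboratory of Mathematical Engineering and Advanced Computing, PLA Information Engineering University, Zhengzhou 450002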

Source: Computer Science (《计算机科学》), 2014, No. 6, pp. 12-17 (6 pages)

Abstract: The algorithms behind mathematical library functions involve a large number of memory accesses. The slave cores of current heterogeneous many-core processors access data through shared main memory, which severely limits their memory access speed, so the performance of mathematical library functions on such architectures cannot meet the requirements of high-performance computing. To address this problem, the paper proposes a scheduling strategy based on memory access instructions that hides memory access latency behind computation latency, improving the performance of a mathematical function library implemented in assembly. In addition, combining a dynamic calling mode with the slave cores' local data memory (LDM), it proposes an algorithm named ldm_call that increases memory access speed. Both optimization techniques are generally applicable to shared-memory architectures; they effectively reduce the memory access overhead of the functions and increase memory access speed. Experiments show that the two techniques improve function performance by 16.08% and 37.32% on average, respectively.

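Keywords: heterogeneous many-core; mathematical function library; memory access optimization; instruction scheduling; local data memory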

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
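
Illustration: the two ideas named in the abstract can be pictured with a small C sketch. This is not the paper's ldm_call algorithm or its assembly-level scheduling, whose details the record does not give. It is a minimal illustration under stated assumptions: `main_mem_coeffs`, `local_coeffs`, `stage_table`, `horner_prefetched`, and `exp_approx` are hypothetical names, an ordinary static array stands in for the LDM, and the polynomial is a degree-7 Taylor approximation of exp(x) chosen only as an example. The first function copies a coefficient table into a local buffer once, before repeated calls. The second issues the load of the next coefficient before the multiply-add that consumes the current one, so that the load could overlap with computation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table assumed to live in shared main memory:
   Taylor coefficients of exp(x), highest degree first. */
static const double main_mem_coeffs[8] = {
    1.0 / 5040.0, 1.0 / 720.0, 1.0 / 120.0, 1.0 / 24.0,
    1.0 / 6.0,    1.0 / 2.0,   1.0,         1.0
};

#define LOCAL_CAPACITY 16

/* Stand-in for the slave core's LDM (assumption: on real hardware the
   buffer would be placed in local memory by a platform-specific means). */
static double local_coeffs[LOCAL_CAPACITY];

/* Copy the table into the local buffer once, before repeated calls. */
static void stage_table(const double *src, size_t n) {
    assert(n <= LOCAL_CAPACITY);
    memcpy(local_coeffs, src, n * sizeof src[0]);
}

/* Horner evaluation (requires n >= 1) in which the next coefficient is
   loaded before the multiply-add that uses the current one. */
static double horner_prefetched(const double *c, size_t n, double x) {
    double acc = 0.0;
    double cur = c[0];
    for (size_t i = 1; i < n; ++i) {
        double next = c[i];   /* load issued early */
        acc = acc * x + cur;  /* computation that can overlap the load */
        cur = next;
    }
    return acc * x + cur;
}

/* Evaluate the example polynomial from the staged local copy. */
static double exp_approx(double x) {
    return horner_prefetched(local_coeffs, 8, x);
}
```

In C the compiler is free to reorder these statements, so the ordering is only suggestive; in hand-written assembly, as the abstract describes, the placement of load instructions relative to arithmetic is explicit.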

 
