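Authors: XU Jinchen (许瑾晨)[1], GUO Shaozhong (郭绍忠)[1], HUANG Yongzhong (黄永忠)[1], WANG Lei (王磊)[1]
Affiliation: [1] State Key Laboratory of Mathematical Engineering and Advanced Computing, PLA Information Engineering University, Zhengzhou 450002, China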
Source: Computer Science (《计算机科学》), 2014, No. 6, pp. 12-17 (6 pages)
Abstract: The algorithms behind mathematical library functions generate a large number of memory accesses. In current heterogeneous many-core architectures, the slave cores access data through shared main memory, which severely limits their memory access speed. As a result, the performance of mathematical library functions on such architectures cannot meet the requirements of high-performance computing. To address this problem, the paper proposes a scheduling strategy based on memory access instructions that hides memory access latency behind computation latency, improving the performance of functions in a mathematical function library implemented in assembly. In addition, by combining a dynamic calling mode with the local data memory (LDM) of the slave cores, it proposes an algorithm called ldm_call that increases memory access speed. Both optimization techniques are generally applicable to shared-memory architectures, and both effectively reduce the memory access overhead of the functions and increase memory access speed. Experiments show that the two techniques improve function performance by 16.08% and 37.32% on average, respectively.
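Keywords: heterogeneous many-core; mathematical function library; memory access optimization; instruction scheduling; local data memory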
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]