Authors: QIAN Hong (钱宏), WANG Fei (王飞), LIU Sha (刘沙)[1], ZHENG Tian-Yu (郑天宇), SONG Jia-Wei (宋佳伟), AN Hong (安虹)[1]
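Affiliations: [1] School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China [2] Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China [3] Zhejiang Lab, Hangzhou 311121, China [4] National Supercomputing Center in Wuxi, Wuxi 214000, China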
Source: Computer Systems & Applications (《计算机系统应用》), 2024, No. 2, pp. 62-71 (10 pages)
Funding: National Key Research and Development Program of China (2020YFB0204602).
Abstract: On Shenwei heterogeneous many-core processors, computing cores access main memory with very high latency, so programs should avoid main-memory accesses from computing-core code wherever possible. The global offset table (GOT) stores the addresses of a program's global variables and functions. It is not suited to the scarce local storage of the computing cores, and because its access pattern is usually scattered, it is also a poor candidate for cache prefetching. The main-memory accesses introduced by GOT lookups therefore have a considerable impact on program performance. Targeting both static and dynamic linking of heterogeneous many-core programs, this paper analyzes the limitations of linker relaxation optimization and implements a global symbol relocation optimization based on "gp base address + extended offset" that avoids main-memory accesses. Experimental results show that, at the cost of a small increase in code size, the optimization effectively avoids the GOT-induced main-memory accesses when computing-core code calls functions and accesses global variables, improving the runtime performance of many-core programs.
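The abstract does not give instruction encodings, so the following is only an illustrative model of what a "gp base address + extended offset" computation could look like, not the paper's implementation. It assumes a hypothetical scheme in which a signed gp-relative offset is split into a high and a low 16-bit part, as many RISC instruction sets do, and compares it with a GOT lookup that costs one memory load. All function names and the 16+16 bit split are assumptions made for this sketch.

```python
def split_offset(offset: int) -> tuple[int, int]:
    """Split a signed gp-relative offset into (hi, lo) 16-bit parts.

    Assumption: the low part is sign-extended when added, so the
    high part is adjusted to compensate (hypothetical encoding).
    """
    lo = offset & 0xFFFF
    if lo >= 0x8000:
        lo -= 0x10000          # interpret the low half as signed
    hi = (offset - lo) >> 16   # exact: offset - lo is a multiple of 65536
    if not -0x8000 <= hi <= 0x7FFF:
        raise ValueError("offset does not fit a 16+16 bit split")
    return hi, lo


def address_via_gp(gp: int, offset: int) -> int:
    """Compute a symbol address with arithmetic only (no memory load)."""
    hi, lo = split_offset(offset)
    return gp + (hi << 16) + lo


def address_via_got(got: list[int], slot: int, loads: list[int]) -> int:
    """Fetch a symbol address from a modelled GOT; records one memory load."""
    loads.append(slot)
    return got[slot]


if __name__ == "__main__":
    gp = 0x4000_0000               # hypothetical gp value
    symbol = gp + 0x1234_8000      # hypothetical symbol address
    got, loads = [symbol], []
    assert address_via_got(got, 0, loads) == address_via_gp(gp, symbol - gp)
    print("memory loads on the GOT path:", len(loads))
```

In this model both paths yield the same address, but only the GOT path touches memory, which is the cost the abstract says the optimization removes in exchange for slightly more code.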
Keywords: many-core processor; global offset table; relocation; linker optimization; performance
Classification: TP332 [Automation and Computer Technology: Computer System Architecture]