Authors: Zhou Yaoyang (周耀阳); Han Boyang (韩博阳); Lin Jiawei (蔺嘉炜); Wang Kaifan (王凯帆); Zhang Linjuan (张林隽); Yu Zihao (余子濠); Tang Dan (唐丹) [1,3]; Wang Sa (王卅); Sun Ninghui (孙凝晖); Bao Yungang (包云岗) [1,2]
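Affiliations: [1] State Key Lab of Processors (Institute of Computing Technology, Chinese Academy of Sciences), Beijing 100190; [2] School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049; [3] Beijing Institute of Open Source Chip, Beijing 100080; [4] Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong 999077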
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2023, No. 6, pp. 1246-1261 (16 pages)
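Funding: Strategic Priority Research Program of the Chinese Academy of Sciences (XDC05030200); Major Program of the National Natural Science Foundation of China (62090020).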
Abstract: In high-performance processor development, accurate and fast performance estimation is the basis for design decisions and parameter selection. Prior work accelerates processor RTL emulation through workload sampling and architectural checkpoints for RTL, which makes it possible to estimate the performance of benchmarks such as SPEC CPU on complex high-performance processors within a few days. However, an iteration cycle of several days is still too long, and there is room to shorten the performance measurement cycle further. During RTL emulation of processors, the warm-up phase accounts for a large share of the time. The HyWarm framework is proposed to accelerate the warm-up phase of performance measurement. HyWarm analyzes the warm-up demand of each workload with a micro-architectural simulator and customizes a warm-up scheme for each workload. For workloads with high cache warm-up demand, HyWarm performs functional warm-up of the RTL caches through the bus protocol. For full-detail RTL emulation, HyWarm uses CPU clustering and LJF scheduling to reduce the maximum completion time. Compared with the best existing sampling-based RTL emulation method, HyWarm shortens the emulation completion time by 53% while keeping accuracy similar to that of the baseline method.
Keywords: high-performance processor; chip design; agile development; workload sampling; functional warm-up
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
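Note on LJF scheduling: the abstract names LJF scheduling as the means of reducing the maximum completion time but gives no details. As a rough illustration only, the sketch below shows the classic greedy longest-job-first heuristic: jobs are sorted by estimated duration, longest first, and each is handed to the currently least-loaded worker. The function name `ljf_schedule`, the job names, and the durations are hypothetical and are not taken from the paper; HyWarm's actual scheduler and its CPU clustering may differ.

```python
import heapq


def ljf_schedule(job_times, num_workers):
    """Greedy longest-job-first assignment (illustrative, not HyWarm's code).

    job_times: dict mapping a job name to its estimated duration.
    Returns (makespan, assignment), where assignment[w] lists worker w's jobs.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")

    # Min-heap of (accumulated load, worker index): the root is always
    # the worker that would become free first.
    heap = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_workers)]

    # Visit jobs from longest to shortest estimated duration.
    longest_first = sorted(job_times.items(), key=lambda kv: kv[1], reverse=True)
    for job, duration in longest_first:
        load, worker = heapq.heappop(heap)
        assignment[worker].append(job)
        heapq.heappush(heap, (load + duration, worker))

    # The makespan is the load of the busiest worker.
    makespan = max(load for load, _ in heap)
    return makespan, assignment


if __name__ == "__main__":
    # Hypothetical checkpoint emulation times, in hours.
    jobs = {"ckpt_a": 8, "ckpt_b": 7, "ckpt_c": 6, "ckpt_d": 5, "ckpt_e": 4}
    makespan, plan = ljf_schedule(jobs, num_workers=2)
    print(makespan, plan)
```

Placing long jobs first tends to keep a long job from starting late and dominating the finish time, although the greedy result is a heuristic and is not guaranteed to be optimal.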