Authors: Wang Xiaoyin (王箫音)[1], Tong Dong (佟冬)[1], Dang Xianglei (党向磊)[1], Lu Junlin (陆俊林)[1], Cheng Xu (程旭)[1]
Affiliation: [1] Microprocessor Research and Development Center, Peking University, Beijing 100871, China
Source: Acta Scientiarum Naturalium Universitatis Pekinensis (Journal of Peking University, Natural Science Edition), 2011, No. 1, pp. 35-44 (10 pages)
Fund: Supported by the National High-Tech R&D Program of China (863 Program, 2006AA010202)
Abstract: This paper explores the design space of pre-execution (runahead) mechanisms for in-order processors and quantifies how their benefits change with cache capacity and memory latency. The experiments show that, for an in-order processor, both preserving and reusing the valid results produced during pre-execution and forwarding data between pre-executed memory instructions effectively improve performance; the former also reduces energy overhead. Specifically, propagating valid values from pre-executed stores to dependent pre-executed loads through a small store cache yields a significant speedup. Combining the two mechanisms, an in-order runahead processor with a 32-entry store cache and a 128-entry FIFO for preserving and reusing results improves performance over the baseline processor by 24.07% on average, with an energy overhead of only 4.93%. The results further show that pre-execution still delivers substantial speedups even with a very large cache hierarchy, and that as memory latency increases, its performance and energy-efficiency advantages for in-order processors become even more pronounced.
CLC number: TP332 [Automation and Computer Technology — Computer System Architecture]
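The abstract combines two mechanisms: a bounded FIFO that preserves valid runahead results so they can be reused instead of re-executed after the miss returns, and a small store cache that forwards values from pre-executed stores to dependent pre-executed loads (speculative stores must not reach real memory). As a rough illustration only (not the paper's actual microarchitecture), here is a toy Python sketch; the three-op instruction encoding, literal addresses, and the single poison value standing in for invalid runahead results are all illustrative assumptions.

```python
from collections import deque, OrderedDict

INV = object()  # poison marker for results that depend on the missed load

class StoreCache:
    """Tiny FIFO-replacement store cache: lets pre-executed loads see
    values written by earlier pre-executed stores, which are never
    allowed to update real memory while speculating."""
    def __init__(self, capacity=32):
        self.capacity = capacity
        self.lines = OrderedDict()

    def write(self, addr, value):
        self.lines.pop(addr, None)
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # evict the oldest entry
        self.lines[addr] = value

    def read(self, addr):
        return self.lines.get(addr)          # None => address not present

def runahead(program, regs, memory, miss_reg, fifo_size=128):
    """Pre-execute `program` while the load into `miss_reg` is still
    outstanding; record each result (valid or poisoned) in a bounded FIFO."""
    regs = dict(regs, **{miss_reg: INV})     # the miss target is poisoned
    fifo = deque(maxlen=fifo_size)
    sc = StoreCache()
    for i, (op, *args) in enumerate(program):
        if op == "add":                      # ("add", dst, src1, src2)
            d, a, b = args
            regs[d] = INV if INV in (regs[a], regs[b]) else regs[a] + regs[b]
            fifo.append((i, regs[d]))
        elif op == "store":                  # ("store", addr, src)
            addr, s = args
            sc.write(addr, regs[s])          # may record INV: poisons readers
        elif op == "load":                   # ("load", dst, addr)
            d, addr = args
            v = sc.read(addr)                # forward from store cache if hit
            regs[d] = memory.get(addr, 0) if v is None else v
            fifo.append((i, regs[d]))
    return fifo

def replay(program, regs, memory, fifo):
    """Normal execution after the miss returns: reuse every valid
    pre-executed result instead of recomputing it."""
    saved = {i: v for i, v in fifo}
    reused = 0
    for i, (op, *args) in enumerate(program):
        if op == "store":
            addr, s = args
            memory[addr] = regs[s]           # stores now commit for real
        elif i in saved and saved[i] is not INV:
            regs[args[0]] = saved[i]         # reuse, skipping re-execution
            reused += 1
        elif op == "add":
            d, a, b = args
            regs[d] = regs[a] + regs[b]
        elif op == "load":
            d, addr = args
            regs[d] = memory.get(addr, 0)
    return reused
```

For example, replaying a four-instruction program past a missed load into `r0` reuses the two valid runahead results (an independent add and a store-forwarded load) and recomputes only the add that was poisoned by the miss. The sketch deliberately omits real-design concerns such as tracking invalid store-cache entries evicted mid-runahead.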