Design Space Exploration of the Executing Ahead Mechanism for In-Order Processors (in English)

A Comprehensive Study of Executing Ahead Mechanism for In-Order Microprocessors


Authors: 王箫音[1], 佟冬[1], 党向磊[1], 陆俊林[1], 程旭[1]

Affiliation: [1] Microprocessor Research and Development Center, Peking University, Beijing 100871

Source: Acta Scientiarum Naturalium Universitatis Pekinensis (Journal of Peking University, Natural Science Edition), 2011, No. 1, pp. 35-44 (10 pages)

Funding: Supported by the National High-Tech R&D Program of China (863 Program, Grant No. 2006AA010202)

Abstract: The authors explore the design space of the executing ahead (runahead) mechanism for in-order processors, and quantify how its benefits vary with cache capacity and memory latency. The results demonstrate that, for an in-order processor, both preserving and reusing the valid results produced during executing ahead and propagating data values between pre-executed memory instructions improve performance effectively, and the former also reduces energy consumption. In particular, propagating valid data values from stores to dependent loads through a small store cache increases performance significantly. Combining the two techniques, an in-order executing ahead processor with a 32-entry store cache and a 128-entry FIFO for preserving and reusing results improves performance by 24.07% on average over the baseline processor, with an energy overhead of only 4.93%. It is further shown that executing ahead still delivers substantial speedups even with a very large cache hierarchy, and that as memory latency grows, its advantages in both performance and energy efficiency become more pronounced.
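The mechanism the abstract summarizes can be pictured with a small, purely illustrative sketch: when an in-order core would otherwise stall on a long-latency cache miss, it keeps executing ahead speculatively, forwards store values to dependent loads through a small store cache, and queues valid results in a FIFO so that normal-mode re-execution can reuse them instead of recomputing. The Python sketch below is a hypothetical model of that idea only, not the paper's implementation; all class and method names are invented here, and only the 32-entry store cache and 128-entry FIFO sizes come from the abstract.

```python
# A minimal, hypothetical sketch (not the paper's design) of the
# "executing ahead" idea: speculative execution past a cache miss,
# store-to-load forwarding through a small store cache, and a FIFO
# that preserves valid results for reuse after returning to normal mode.

from collections import OrderedDict, deque

STORE_CACHE_ENTRIES = 32    # store cache size reported in the abstract
RESULT_FIFO_ENTRIES = 128   # result-reuse FIFO size reported in the abstract


class RunaheadCore:
    def __init__(self):
        # addr -> value, evicted oldest-first when full
        self.store_cache = OrderedDict()
        # (pc, value) pairs produced during executing ahead
        self.result_fifo = deque(maxlen=RESULT_FIFO_ENTRIES)

    def runahead_store(self, addr, value):
        """During executing ahead, stores do not update real memory; the value
        is kept in the store cache so dependent runahead loads can read it."""
        self.store_cache[addr] = value
        self.store_cache.move_to_end(addr)
        if len(self.store_cache) > STORE_CACHE_ENTRIES:
            self.store_cache.popitem(last=False)   # drop the oldest entry

    def runahead_load(self, pc, addr, memory):
        """A runahead load first checks the store cache, then memory. A valid
        result is queued in the FIFO for later reuse; None marks an invalid
        value whose data is still missing."""
        if addr in self.store_cache:
            value = self.store_cache[addr]
        elif addr in memory:
            value = memory[addr]
        else:
            return None
        self.result_fifo.append((pc, value))
        return value

    def reuse_result(self, pc):
        """Back in normal mode, a re-executed instruction whose pc matches the
        FIFO head reuses the preserved value instead of recomputing it."""
        if self.result_fifo and self.result_fifo[0][0] == pc:
            return self.result_fifo.popleft()[1]
        return None
```

In the paper's evaluation it is the combination of result reuse and store-to-load propagation that yields the 24.07% average speedup at a 4.93% energy overhead; the sketch only shows where the two structures sit in the executing-ahead flow.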

Keywords: in-order processor; executing ahead; memory latency tolerance

Classification: TP332 [Automation and Computer Technology: Computer System Architecture]

 
