Design and Implementation of a High-Performance SIMT Processor for Machine Learning


Authors: ZHANG Hong-wei [1]; LI Tao [1]; FENG Zhen-fu [1]; JIA Rui [2]

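Affiliations: [1] School of Electronic Engineering, Xi'an University of Posts & Telecommunications, Xi'an, Shaanxi 710121, China; [2] School of Computing, Xi'an University of Posts & Telecommunications, Xi'an, Shaanxi 710121, China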

Source: Microelectronics & Computer (《微电子学与计算机》), 2019, No. 9, pp. 79-83 (5 pages)

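Funding: Key Program of the National Natural Science Foundation of China (61136002); Scientific Research Project of the Shaanxi Provincial Department of Education (2050205)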

Abstract: To address the large-volume data computation that arises in machine learning, a high-performance SIMT (Single Instruction Multiple Threads) architecture processor was developed in-house. The design uses a special four-stage pipeline structure, is described in synthesizable Verilog HDL, and performs multi-threaded parallel computation on data. A simulation and verification platform was built on a Xilinx Virtex UltraScale xcvu440-flga2892-2-e FPGA to verify the function of the whole circuit; the results show that the design satisfies the multi-thread parallel processing mechanism. The design was synthesized with Synopsys Design Compiler against the SMIC 65 nm CMOS standard cell library. The maximum operating frequency of the system clock is 370 MHz, and the maximum system power consumption is 4.251 mW.
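The SIMT execution model named in the abstract can be illustrated with a small software sketch: one instruction stream is fetched once and applied to several thread lanes, each with its own registers. This is only a behavioral illustration, not the authors' Verilog design; the lane count, register layout, opcode names, active mask, and the functions `simt_execute` and `run_example` are all assumptions introduced here, and the paper's four-stage pipeline is not modeled.

```python
from typing import List, Tuple

# Hypothetical illustration only: opcodes, register layout, and lane count
# are assumptions, not details taken from the paper.
Instruction = Tuple[str, int, int, int]  # (opcode, dst, src_a, src_b)

OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


def simt_execute(program: List[Instruction],
                 registers: List[List[int]],
                 active_mask: List[bool]) -> List[List[int]]:
    """Apply each instruction to every active thread lane in lockstep."""
    for opcode, dst, src_a, src_b in program:       # one instruction fetch
        for lane, regs in enumerate(registers):     # many threads execute it
            if active_mask[lane]:                   # masked lanes stay idle
                regs[dst] = OPS[opcode](regs[src_a], regs[src_b])
    return registers


def run_example() -> List[int]:
    """Per-lane multiply-accumulate: r2 += r0 * r1, with lane 3 masked off."""
    xs = [1, 2, 3, 4]
    ws = [10, 20, 30, 40]
    # Register layout per lane (assumed): r0 = x, r1 = w, r2 = acc, r3 = temp.
    registers = [[x, w, 0, 0] for x, w in zip(xs, ws)]
    program = [("mul", 3, 0, 1), ("add", 2, 2, 3)]
    mask = [True, True, True, False]
    simt_execute(program, registers, mask)
    return [regs[2] for regs in registers]


if __name__ == "__main__":
    print(run_example())
```

In hardware the inner loop over lanes would run in parallel rather than sequentially; the sketch only shows that all lanes share a single instruction stream while operating on private data.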

Keywords: SIMT; pipeline; multithreading; parallel computing; FPGA

Classification: TP338.6 [Automation and Computer Technology: Computer System Architecture]

 
