Overview of DSP architecture development

Cited by: 1

Authors: SONG Wenna, XU Dongjun, CHEN Liang [1]

Affiliation: [1] Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

Source: Microelectronics & Computer (《微电子学与计算机》), 2023, No. 4, pp. 1-7 (7 pages)

Abstract: A digital signal processor (DSP) is a specialized microprocessor for digital signal processing, with important applications in communications, automation, radar, aerospace, and other fields. This paper systematically reviews the development and current state of DSP architecture, introduces the DSP products of the major manufacturers and their performance, and summarizes the main structural characteristics of DSP chips. It also analyzes the main techniques used in existing DSP architectures to improve data-level and instruction-level parallelism, including the Harvard architecture, hardware multipliers, SIMD, VLIW, and superscalar execution. In light of current DSP application requirements, the paper proposes three directions for DSP architecture research:
(1) Move toward ultra-high-performance DSPs by increasing data and instruction parallelism: improve vector and scalar parallel capability, support tensor computation, and integrate dedicated control paths and functional units for neural network operators, thereby raising AI computing capability.
(2) Start from the instruction system and combine a variable-length instruction set with superscalar techniques. This achieves instruction-level parallelism and, together with computation-flow control instructions that can be extended to suit neural network algorithms, improves the mapping of AI algorithms onto the hardware, while reducing code size, storage pressure, instruction-fetch bandwidth, and cost, and improving real-time processing for edge intelligence applications.
(3) Adopt a distributed storage structure that supports compression and concurrent access for sparse neural networks, improving on-chip deployment of edge intelligence and multi-channel parallel computation across network layers.
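To make the role of the hardware multiplier and SIMD concrete, the sketch below shows the multiply-accumulate (MAC) loop of a direct-form FIR filter, the kind of kernel these features are meant to speed up. It is an illustration only and is not taken from the paper: the function names and the `lanes` parameter are hypothetical, and the second function only mimics SIMD by keeping one accumulator per lane in ordinary Python.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: y[n] = sum over k of coeffs[k] * samples[n - k]."""
    output = []
    for n in range(len(samples)):
        acc = 0  # plays the role of the accumulator register
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                # One multiply-accumulate; a hardware multiplier with an
                # accumulator typically performs this as a single operation.
                acc += c * samples[n - k]
        output.append(acc)
    return output


def fir_filter_lanes(samples, coeffs, lanes=4):
    """Same filter, computing `lanes` outputs together to mimic SIMD.

    Hypothetical model: the inner `for lane` loop stands in for what
    would be one vector instruction on a SIMD-capable DSP.
    """
    output = []
    for base in range(0, len(samples), lanes):
        width = min(lanes, len(samples) - base)
        acc = [0] * width  # one accumulator per lane
        for k, c in enumerate(coeffs):
            for lane in range(width):
                idx = base + lane - k
                if idx >= 0:
                    acc[lane] += c * samples[idx]
        output.extend(acc)
    return output


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    h = [1, 1, 1]  # three-tap moving sum
    print(fir_filter(x, h))
    print(fir_filter_lanes(x, h, lanes=4))
```

Both functions produce the same output; the lane version only regroups the work so that independent outputs are accumulated side by side, which is the data-level parallelism that SIMD exploits.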

Keywords: Harvard architecture; hardware multiplier; SIMD architecture; VLIW technology; superscalar

Classification code: TP332 [Automation and Computer Technology: Computer System Architecture]

 
