定点FFT的DSP向量混洗并行基4算法  

Vector Shuffling Based Parallel Radix-4 FFT Algorithm for Fixed Point Data on Vector DSP

在线阅读下载全文

作  者:王书盈 胡勇华[1] 张鑫 陆浩松 WANG Shuying;HU Yonghua;ZHANG Xin;LU Haosong(School of Computer Science and Engineering,Hunan University of Science and Technology,Xiangtan 411201,China)

机构地区:[1]湖南科技大学计算机科学与工程学院,湖南湘潭411201

出  处:《湖南科技大学学报(自然科学版)》2024年第2期75-86,共12页Journal of Hunan University of Science And Technology:Natural Science Edition

基  金:湖南省自然科学基金资助项目(2023JJ50019);湖南省教育厅科研项目资助(20B242,19A169)。

摘  要:基于定点数据的快速傅里叶变换(Fast Fourier Transform, FFT)算法能在合理的精度范围内降低对硬件的要求,计算速度更快.文章面向高性能向量数字信号处理器(Digital Signal Processor, DSP)的硬件特征,构建基4复数FFT算法的高效指令级并行处理模型.该模型充分考虑基4方法下的复数FFT计算过程和蝶形组集合的特征,将SIMD计算、向量混洗、索引DMA等技术与复数FFT的基4变换过程充分融合,有效控制计算过程中存储器和片内缓存之间的数据块搬移需求,提升SIMD计算单元的利用率.在基于自主YHFT-M7002处理器的FT-M7002DSK平台上进行试验研究,验证算法的有效性.试验结果表明:与CCS模拟所得TI的相应TMS320C6678库函数性能相比,所提优化算法的平均加速比达到TI库函数的4.79倍.The Fast Fourier Transform(FFT)algorithm of fixed-point data can reduce the hardware requirements within a reasonable accuracy range but obtain faster computing speed.Based on the hardware characteristics of high performance vector Digital Signal Processors(DSPs),this paper constructs an efficient instruction level parallel processing algorithm of Radix-4 complex FFT algorithm.This algorithm considers the calculation process of the Radix-4 complex FFT algorithm and the characteristics of butterfly units,and it fully integrates SIMD calculation,vector shuffling,indexing DMA and other techniques with the transformation process of Radix-4 complex FFT.This algorithm effectively controls the data block movement between memory and in-chip cache during the computing process and improves the utilization rate of SIMD processing unit.In this paper,an experimental study is conducted on the FT-M7002DSK platform of the YHFT-M7002 processor,which has independent intellectual property rights.Result shows that,compared with the performance obtained by CCS simulator for the corresponding TMS320C6678 library function,the average performance of our algorithm is 3.79 times faster than the former.

关 键 词:基4复数FFT SIMD技术 向量DSP 向量混洗 索引DMA 

分 类 号:TP311[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象