Authors: CHEN Bo-lun; HE Wei-feng[1] (Department of Microelectronics and Nanoscience, Shanghai Jiaotong University, Shanghai 200240)
Source: Modern Computer (《现代计算机》), 2020, No. 12, pp. 68-72 (5 pages)
Abstract: This paper quantitatively analyzes how each stage of the 1D FFT (radix selection, twiddle factor computation, and bit-reversal reordering) behaves when executed in parallel on a GPU, and proposes a shared memory access mechanism based on the memory access span of the butterfly operator to address inefficient shared memory access. Building on this, it proposes a batch column processing mechanism to address non-contiguous global memory access during the column transform of the 2D FFT. Experimental results show that, for image sizes from 1024×1024 to 4096×4096 pixels, the accelerated 2D FFT program outperforms the CUFFT library function by 5%-13%.
Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]
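
For readers unfamiliar with the stages named in the abstract, the following is a minimal CPU-side sketch in plain Python, not the authors' GPU implementation. It assumes a radix-2 decomposition and power-of-two sizes; the function names (`bit_reverse_permute`, `fft_1d`, `fft_2d`) are hypothetical and chosen only for illustration. It shows the bit-reversal reordering, the per-stage twiddle factors, the butterfly span that doubles at each stage, and the row-then-column structure of a 2D FFT.

```python
import cmath


def bit_reverse_permute(x):
    """Return a copy of x reordered by bit-reversed index (length must be a power of two)."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        rev = 0
        for b in range(bits):
            if (i >> b) & 1:
                rev |= 1 << (bits - 1 - b)
        out[rev] = complex(x[i])
    return out


def fft_1d(x):
    """Iterative radix-2 decimation-in-time FFT (illustrative, not optimized)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    a = bit_reverse_permute(x)
    span = 1  # distance between the two inputs of one butterfly
    while span < n:
        # Twiddle factors for this stage: exp(-2*pi*i*k / (2*span))
        twiddles = [cmath.exp(-1j * cmath.pi * k / span) for k in range(span)]
        for start in range(0, n, 2 * span):
            for k in range(span):
                u = a[start + k]
                v = a[start + k + span] * twiddles[k]
                a[start + k] = u + v
                a[start + k + span] = u - v
        span *= 2  # the access span doubles at every stage
    return a


def fft_2d(image):
    """2D FFT by transforming every row, then every column."""
    rows = [fft_1d(r) for r in image]
    # zip(*rows) transposes, so each column becomes a contiguous list
    cols = [fft_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

In this sketch the column pass works on a transposed copy so that each column is contiguous. That is only a rough analogy to the non-contiguous column access the abstract describes; the paper's batch column processing mechanism is not reproduced here.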