Parallel Pipeline Hardware Design of Intra Rate-Distortion Optimization Prediction Mode in HEVC (cited by: 3)

Authors: LIN Zhijian (林志坚), DING Yongqiang (丁永强), YANG Xiuzhi (杨秀芝)[1], WU Linhuang (吴林煌)[1]

Affiliation: [1] College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, Fujian, China

Source: Journal of South China University of Technology (Natural Science Edition), 2023, No. 5, pp. 95-103 (9 pages)

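Funding: General Program of the National Natural Science Foundation of China (61871132, 62171135); Science and Technology Innovation Team Project of Fujian Provincial Universities (Industrialization Special Project, 500190).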

Abstract: In recent years, video resolution and frame rate have risen steadily to meet the growing demand for video data. However, the speed of real-time compression encoding is constrained by both: the higher the resolution and frame rate, the longer encoding takes. To enable real-time compression encoding of video sequences at higher resolutions and frame rates, this paper designs a new parallel pipeline hardware architecture for the intra rate-distortion optimization prediction mode, which supports intra prediction coding of coding tree units up to 64×64. First, a 9-way prediction-mode parallel scheme is designed. Second, a pipeline hardware architecture is implemented that processes 4×4 blocks as the basic unit in Z-scan order, and the prediction data of 32×32 prediction units are reused in place of those of 64×64 prediction units to reduce computation. Finally, based on this pipeline architecture, a new Hadamard transform circuit is proposed for efficient pipelined processing. Experimental results show that, on an Altera Arria 10 series field programmable gate array, the 9-way mode-parallel architecture occupies only 75 kb of look-up tables and 55 kb of registers, reaches a clock frequency of 207 MHz, needs only 4096 clock cycles to complete the prediction of one 64×64 coding tree unit, and supports real-time all-I-frame encoding at up to 1080P resolution and 99 f/s. Compared with existing designs, the proposed scheme achieves 1080P real-time video encoding at a higher frame rate with a smaller circuit area.
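The figures in the abstract can be cross-checked with a little arithmetic, and the two standard building blocks it names (Z-scan ordering of 4×4 blocks and a 4×4 Hadamard transform) can be illustrated in software. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, a 1080P frame is assumed to be 1920×1080 pixels with partial coding tree units at the frame edge costing a full slot, and `satd_4x4` is the textbook unnormalised Hadamard cost, not the transform circuit proposed in the paper.

```python
import math

def ctus_per_frame(width, height, ctu=64):
    # Assumption: a partial CTU at the frame edge still occupies a full CTU slot.
    return math.ceil(width / ctu) * math.ceil(height / ctu)

def max_fps(clock_hz, cycles_per_ctu, width, height, ctu=64):
    # Frames per second if every CTU takes exactly cycles_per_ctu clock cycles.
    return clock_hz / (cycles_per_ctu * ctus_per_frame(width, height, ctu))

def z_order_4x4(ctu=64):
    # Pixel origins (x, y) of the 4x4 blocks of one CTU in Z-scan (Morton) order:
    # even bits of the block index give the column, odd bits give the row.
    n = ctu // 4                      # 4x4 blocks per CTU side
    bits = n.bit_length() - 1         # n is assumed to be a power of two
    order = []
    for idx in range(n * n):
        bx = by = 0
        for b in range(bits):
            bx |= ((idx >> (2 * b)) & 1) << b
            by |= ((idx >> (2 * b + 1)) & 1) << b
        order.append((4 * bx, 4 * by))
    return order

# 4x4 Sylvester-type Hadamard matrix (symmetric, so H equals its transpose).
H4 = [[1, 1, 1, 1],
      [1, -1, 1, -1],
      [1, 1, -1, -1],
      [1, -1, -1, 1]]

def matmul4(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def satd_4x4(orig, pred):
    # Sum of absolute values of H * (orig - pred) * H, without normalisation.
    diff = [[orig[r][c] - pred[r][c] for c in range(4)] for r in range(4)]
    coeffs = matmul4(matmul4(H4, diff), H4)
    return sum(abs(v) for row in coeffs for v in row)

if __name__ == "__main__":
    print(ctus_per_frame(1920, 1080))            # CTUs in one assumed 1080P frame
    print(max_fps(207e6, 4096, 1920, 1080))      # frame rate at 207 MHz
    print(z_order_4x4()[:4])                     # first four 4x4 block origins
```

Under these assumptions a 1080P frame contains 30 × 17 = 510 coding tree units, so 207 MHz divided by 510 × 4096 cycles gives roughly 99 frames per second, which agrees with the reported 99 f/s. A 64×64 coding tree unit holds 256 blocks of 4×4 pixels, so 4096 cycles corresponds to an average of 16 cycles per block; this is an average only and says nothing about how the paper's pipeline actually schedules the work.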

Keywords: intra prediction; field programmable gate array; mode parallelism; High Efficiency Video Coding

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
