Vectorization Parallelism of Video Decoding Based on Shenwei 421


Authors: PEI Hang, WANG Lei, WANG Wei [1,2], ZHANG Shu-qin

Affiliations: [1] School of Computer Science, Zhongyuan University of Technology, Zhengzhou 451191, Henan, China; [2] Research Institute of Frontier Information Technology, Zhongyuan University of Technology, Zhengzhou 451191, Henan, China

Source: Computer Technology and Development, 2021, No. 10, pp. 81-86 (6 pages)

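Funding: Key Scientific Research Project of Henan Higher Education Institutions (18B520044); Henan Province Science and Technology Research Project (182102210526).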

Abstract: After being ported to the Shenwei platform, the H.264 decoder suffered from low decoding efficiency and stuttering video playback. To improve video decoding performance and meet the multimedia needs of users of the domestic Shenwei platform, the H.264 decoder in the FFmpeg open-source codec library is first analyzed in detail, and a performance analysis tool is used to identify the hot functions in video decoding. The vector extension unit of the Shenwei processor is then exploited: in key decoder modules such as motion compensation and the inverse DCT, scalar code is replaced with vector instructions through hand-written inline assembly, which shortens the instruction cycle and achieves vectorized parallelism. Finally, loops in the loop filter code that cannot be vectorized directly are restructured, for example by reorganizing arrays, so that they satisfy vectorization analysis and can then be computed with vector instructions, exploiting multimedia parallelism more deeply and speeding up the multimedia program. Experimental results show that vectorization improves video decoding performance by up to 35.3%, frees CPU resources, resolves the stuttering playback, and helps promote the market adoption of Shenwei processors.
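The array-reorganization idea mentioned for the loop filter can be sketched in portable C. This is a minimal illustration and not the authors' code: it uses no Shenwei vector instructions, the function names `filter_edge_strided` and `filter_edge_reorganized` are hypothetical, and the filter is reduced to the p0/q0 update of the H.264 normal-strength luma filter (the alpha/beta threshold checks and the p1/q1 updates are omitted). Filtering a vertical edge reads pixels that sit `stride` bytes apart from one row to the next, which hinders vectorization; gathering them into contiguous per-tap arrays first turns the arithmetic into a unit-stride, lane-wise loop.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 16 /* rows crossed by one vertical macroblock edge */

static int clip3(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Scalar reference: every iteration touches p1 p0 | q0 q1 around column x,
 * and consecutive iterations are `stride` bytes apart in memory.
 * Like FFmpeg, this relies on >> being an arithmetic shift for negatives. */
void filter_edge_strided(uint8_t *pix, int stride, int x, int tc)
{
    assert(tc >= 0 && x >= 2);
    for (int r = 0; r < LANES; r++) {
        uint8_t *row = pix + r * stride + x;
        int p1 = row[-2], p0 = row[-1], q0 = row[0], q1 = row[1];
        int delta = clip3((4 * (q0 - p0) + (p1 - q1) + 4) >> 3, -tc, tc);
        row[-1] = (uint8_t)clip3(p0 + delta, 0, 255);
        row[0]  = (uint8_t)clip3(q0 - delta, 0, 255);
    }
}

/* Reorganized version: gather each tap into its own contiguous array,
 * run the arithmetic lane-wise, then scatter the two modified taps back. */
void filter_edge_reorganized(uint8_t *pix, int stride, int x, int tc)
{
    assert(tc >= 0 && x >= 2);
    int16_t p1[LANES], p0[LANES], q0[LANES], q1[LANES];

    /* Gather: the only strided accesses are isolated here. */
    for (int r = 0; r < LANES; r++) {
        const uint8_t *row = pix + r * stride + x;
        p1[r] = row[-2];
        p0[r] = row[-1];
        q0[r] = row[0];
        q1[r] = row[1];
    }

    /* Lane-wise loop: unit stride and no dependence between iterations,
     * so it maps onto vector instructions or compiler auto-vectorization. */
    for (int r = 0; r < LANES; r++) {
        int delta = clip3((4 * (q0[r] - p0[r]) + (p1[r] - q1[r]) + 4) >> 3,
                          -tc, tc);
        p0[r] = (int16_t)clip3(p0[r] + delta, 0, 255);
        q0[r] = (int16_t)clip3(q0[r] - delta, 0, 255);
    }

    /* Scatter: write back only the pixels the filter changes. */
    for (int r = 0; r < LANES; r++) {
        uint8_t *row = pix + r * stride + x;
        row[-1] = (uint8_t)p0[r];
        row[0]  = (uint8_t)q0[r];
    }
}
```

Both functions produce identical output; the second only changes the memory layout seen by the arithmetic loop. Whether the gather and scatter cost is repaid depends on the target's vector width and load instructions, which this sketch does not model.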

Keywords: H.264 decoder; FFmpeg codec library; Shenwei processor; single instruction multiple data (SIMD); parallel computing

Classification code: TP302 [Automation and Computer Technology: Computer System Architecture]

 
