Authors: PEI Hang (裴航); WANG Lei (王磊); WANG Wei (王威) [1,2]; ZHANG Shu-qin (张书钦)
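Affiliations: [1] School of Computer Science, Zhongyuan University of Technology, Zhengzhou 451191, Henan, China; [2] Research Institute of Frontier Information Technology, Zhongyuan University of Technology, Zhengzhou 451191, Henan, China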
Source: Computer Technology and Development (《计算机技术与发展》), 2021, No. 10, pp. 81-86 (6 pages)
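Funding: Key Scientific Research Project of Henan Higher Education Institutions (18B520044); Henan Province Science and Technology Research Project (182102210526).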
Abstract: After being ported to the Shenwei platform, the H.264 decoder suffered from low decoding efficiency and choppy video playback. To improve video decoding performance and meet the multimedia needs of users of the domestic Shenwei platform, the authors first analyze the H.264 decoder in the open-source FFmpeg codec library in detail and use a performance analysis tool to identify the hotspot functions in video decoding. They then exploit the vector extension unit of the Shenwei processor: in key modules such as motion compensation and the inverse DCT, code is replaced with vector instructions through hand-written inline assembly, which shortens instruction cycles and achieves vectorized parallelism. Finally, loops in the loop filter code that cannot be vectorized directly are restructured, for example by reorganizing arrays, so that they pass vectorization analysis and can then be computed with vector operations. This exploits more of the available multimedia parallelism and speeds up the program. Experimental results show that vectorization improves video decoding performance by up to 35.3%, frees CPU resources, and resolves the choppy playback problem; the authors state that this effectively promotes the commercial adoption of Shenwei processors.
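Illustrative note: the abstract mentions array reorganization as a way to make loop filter loops vectorizable. The following is a minimal sketch of that general idea in plain C, not the authors' code and not Shenwei vector assembly. The filter arithmetic is a toy smoothing step rather than the H.264 deblocking filter, and all names (`EDGE_LEN`, `smooth_edge_strided`, `smooth_edge_reorganized`, `check_equivalence`) are hypothetical. The strided version reads pixel pairs that lie one image row apart; the reorganized version first gathers them into contiguous arrays so that the arithmetic loop has unit stride, which is the form compilers and hand-written SIMD code typically need.

```c
#include <stdint.h>
#include <string.h>

#define EDGE_LEN 16  /* hypothetical edge length: one macroblock side */

/* Toy smoothing across a vertical edge. Each iteration touches a pixel
   pair that is `stride` bytes away from the previous pair, so memory is
   accessed with a non-unit stride. pix points at the first pixel to the
   right of the edge. */
static void smooth_edge_strided(uint8_t *pix, int stride)
{
    for (int i = 0; i < EDGE_LEN; i++) {
        uint8_t p0 = pix[i * stride - 1];
        uint8_t q0 = pix[i * stride];
        uint8_t avg = (uint8_t)((p0 + q0 + 1) >> 1);
        pix[i * stride - 1] = (uint8_t)((p0 + avg + 1) >> 1);
        pix[i * stride]     = (uint8_t)((q0 + avg + 1) >> 1);
    }
}

/* Same arithmetic after array reorganization: gather the strided pixels
   into contiguous arrays, compute in a unit-stride loop, scatter back. */
static void smooth_edge_reorganized(uint8_t *pix, int stride)
{
    uint8_t p0[EDGE_LEN], q0[EDGE_LEN];

    for (int i = 0; i < EDGE_LEN; i++) {        /* gather */
        p0[i] = pix[i * stride - 1];
        q0[i] = pix[i * stride];
    }
    for (int i = 0; i < EDGE_LEN; i++) {        /* unit-stride compute */
        uint8_t avg = (uint8_t)((p0[i] + q0[i] + 1) >> 1);
        p0[i] = (uint8_t)((p0[i] + avg + 1) >> 1);
        q0[i] = (uint8_t)((q0[i] + avg + 1) >> 1);
    }
    for (int i = 0; i < EDGE_LEN; i++) {        /* scatter */
        pix[i * stride - 1] = p0[i];
        pix[i * stride]     = q0[i];
    }
}

/* Runs both versions on identical 16x16 blocks; returns 1 if they agree. */
static int check_equivalence(void)
{
    uint8_t a[EDGE_LEN * EDGE_LEN], b[EDGE_LEN * EDGE_LEN];

    for (int i = 0; i < EDGE_LEN * EDGE_LEN; i++)
        a[i] = (uint8_t)(i * 7 + 3);
    memcpy(b, a, sizeof a);
    smooth_edge_strided(a + 8, EDGE_LEN);      /* edge between columns 7 and 8 */
    smooth_edge_reorganized(b + 8, EDGE_LEN);
    return memcmp(a, b, sizeof a) == 0;
}
```

The gather and scatter loops add memory traffic, so whether this pays off depends on how much arithmetic the middle loop performs; the abstract does not say how the authors balanced that cost.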
Keywords: H.264 decoder; FFmpeg codec library; Shenwei processor; single instruction, multiple data (SIMD); parallel computing
Classification: TP302 [Automation and Computer Technology: Computer System Architecture]