Authors: ZHANG Zhen (张桢), LIANG Jun (梁军), JIA Haipeng (贾海鹏) [2], ZHANG Yunquan (张云泉) [2], LI Qing (李青)
Affiliations: [1] Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China; [2] State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Source: Computer Engineering (《计算机工程》), 2023, Issue 4, pp. 159-165, 173 (8 pages in total)
Funding: National Natural Science Foundation of China (61972376); Beijing Union University Research Project (ZK50202002).
Abstract: The widespread adoption of RISC-V processors makes a high-performance implementation of the FFmpeg multimedia algorithm library on the RISC-V platform increasingly important. This study proposes a series of optimization strategies based on the RISC-V architecture. Targeting algorithms with different characteristics and computational densities in the open-source FFmpeg audio and video library, it exploits the extensibility of the RISC-V instruction set to apply instruction-level acceleration and parallel optimization to certain time-consuming algorithms. Building on an in-depth study of the open-source RISC-V architecture, the authors construct a high-performance FFmpeg algorithm library for RISC-V. For three classes of algorithms (discontinuous memory access, data dependency, and fast data conversion), the optimizations address four aspects: vector unit configuration, vectorized memory access, assembly optimization, and instruction pipeline optimization. Experimental results show that the optimized FFmpeg library performs significantly better on the RISC-V-based XT-910 chip, with speedups of 8.20, 3.67, and 3.62 for the discontinuous memory access, data dependency, and fast data conversion algorithms, respectively.
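Keywords: open-source instruction set architecture; FFmpeg multimedia algorithm library; vectorized memory access; assembly optimization; instruction pipeline optimization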
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]