Affiliation: [1] College of Information Science and Technology, Donghua University, Shanghai 201620, China
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), 2016, No. 9, pp. 1221-1228 (8 pages)
Funding: National Natural Science Foundation of China, No. 41401486; Natural Science Foundation of Shanghai, No. 14ZR1400500
Abstract: For the deblocking filter of high efficiency video coding (HEVC), the existing literature has not studied in depth how to implement parallelism across the algorithm layer and the platform layer. Based on the directed acyclic graph set (DAGS) at the algorithm layer and the compute unified device architecture (CUDA) at the platform layer, this paper proposes a cross-layer parallel decoding scheme for the HEVC deblocking filter. The proposed scheme separates the independent pixel regions of a picture frame to reduce cache accesses, and it weakens the sequential dependence of the filtering process, which makes the filter easier to parallelize on multi-core platforms. Experiments compare three implementations of the HEVC deblocking filter: "serial", "DAGS + multi-core CPU", and "DAGS + GPU". The results show that the proposed "DAGS + GPU" cross-layer parallel scheme achieves decoding speedups of 11 to 24 times, significantly reducing decoding time while maintaining comparable rate-distortion performance.
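Keywords: deblocking filter; directed acyclic graph set (DAGS); parallel processing; multi-core platform; compute unified device architecture (CUDA)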
Classification number: TN919.8 [Electronics and Telecommunications: Communication and Information Systems]