Video Interpolation Based on Compression and Refined Deep Voxel Flow Model

Authors: RU Niuniu (茹妞妞); YU Jinwei (于晋伟); YANG Weihua (杨卫华); BIAN Wei (卞玮) (College of Mathematics, Taiyuan University of Technology, Taiyuan 030000, China)

Affiliation: [1] College of Mathematics, Taiyuan University of Technology, Taiyuan 030000, China

Source: Computer Engineering (《计算机工程》), 2022, No. 9, pp. 248-253 (6 pages)

Funding: National Natural Science Foundation of China (11671296).

Abstract: Video interpolation synthesizes intermediate frames from the image information of adjacent video frames and can be applied directly to slow-motion playback, high-frame-rate video synthesis, animation production, and similar tasks. Existing video interpolation models based on Deep Voxel Flow (DVF) suffer from low synthesis accuracy and a large number of parameters, which limits their deployment on mobile devices. This study proposes a compression-driven refined DVF interpolation model. A DVF model is first pre-trained to improve interpolation quality and to determine high-precision parameters. Sparse compression is then used to prune convolution channels, which reduces the parameter count and yields a coarse voxel flow. The input video frames, the coarse voxel flow, and the coarse intermediate frame are fed to a refinement voxel flow network to obtain a fine voxel flow. On this basis, the fine intermediate frame is computed by trilinear interpolation, which strengthens the model's ability to capture edge information and improves the quality of the intermediate frame. Experiments on the Vimeo90K and UCF101 datasets show that, compared with DVF, SepConv, CDFI, and other models, the proposed model improves peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) by an average of 1.59 dB and 0.015, respectively, and effectively improves video synthesis quality while keeping the increase in parameter count small.

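Keywords: video interpolation; pre-trained model; parameter compression; convolutional neural network; refined deep voxel flow model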

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
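
Illustrative sketch: The abstract states that the intermediate frame is computed from a voxel flow by trilinear interpolation and that quality is reported as PSNR. The NumPy sketch below shows one common way such a synthesis step can be written; it is not the authors' implementation. It assumes the usual deep voxel flow convention, in which each pixel carries a spatial offset (dx, dy) and a temporal blend weight dt, frame 0 is sampled at (x - dx, y - dy), frame 1 at (x + dx, y + dy), and the two samples are blended by dt. The function names bilinear_sample, synthesize_frame, and psnr are hypothetical, and the networks that would predict the coarse and fine voxel flows are omitted.

```python
import numpy as np


def bilinear_sample(img, xs, ys):
    """Sample an (H, W) or (H, W, C) image at float coordinates, clamping at the border."""
    h, w = img.shape[:2]
    xs = np.clip(xs, 0.0, w - 1.0)
    ys = np.clip(ys, 0.0, h - 1.0)
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = xs - x0
    wy = ys - y0
    if img.ndim == 3:  # broadcast the weights over the channel axis
        wx = wx[..., None]
        wy = wy[..., None]
    top = (1 - wx) * img[y0, x0] + wx * img[y0, x1]
    bottom = (1 - wx) * img[y1, x0] + wx * img[y1, x1]
    return (1 - wy) * top + wy * bottom


def synthesize_frame(frame0, frame1, flow, dt):
    """Trilinear synthesis (assumed convention): bilinear in space, linear in time.

    flow has shape (H, W, 2) holding (dx, dy); dt has shape (H, W) with values in [0, 1].
    """
    h, w = frame0.shape[:2]
    ys, xs = np.meshgrid(np.arange(h, dtype=float),
                         np.arange(w, dtype=float), indexing="ij")
    dx, dy = flow[..., 0], flow[..., 1]
    from0 = bilinear_sample(frame0, xs - dx, ys - dy)
    from1 = bilinear_sample(frame1, xs + dx, ys + dy)
    if frame0.ndim == 3:
        dt = dt[..., None]
    return (1 - dt) * from0 + dt * from1


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((np.asarray(reference, dtype=float) - estimate) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

With a zero flow and dt = 0.5 everywhere, synthesize_frame reduces to a plain average of the two input frames, which is a convenient sanity check. In a coarse-to-fine setup like the one the abstract describes, the same function could be applied once with the coarse voxel flow and once with the fine voxel flow.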
