Authors: RU Niuniu (茹妞妞); YU Jinwei (于晋伟); YANG Weihua (杨卫华); BIAN Wei (卞玮) (College of Mathematics, Taiyuan University of Technology, Taiyuan 030000, China)
Source: Computer Engineering (《计算机工程》), 2022, No. 9, pp. 248-253 (6 pages)
Funding: National Natural Science Foundation of China (11671296).
Abstract: Video interpolation synthesizes intermediate frames from the image information of adjacent video frames and can be applied directly to slow-motion playback, high-frame-rate video synthesis, and animation production. Existing interpolation models based on Deep Voxel Flow (DVF) suffer from low synthesis accuracy and large parameter counts, which limits their deployment on mobile devices. This study proposes a compression-driven refined DVF interpolation model. A DVF model is first pre-trained to improve interpolation quality and determine high-precision parameters. Sparse compression is then used to prune convolution channels, which reduces the parameter count and yields a coarse voxel flow. The input video frames, the coarse voxel flow, and the coarse intermediate frame are fed together into a refinement voxel flow network to obtain a refined voxel flow. On this basis, the refined intermediate frame is computed by trilinear interpolation, which strengthens the model's ability to capture edge information and improves the quality of the intermediate frame. Experiments on the Vimeo90K and UCF101 datasets show that, compared with models such as DVF, SepConv, and CDFI, the proposed model improves peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) by an average of 1.59 dB and 0.015, respectively, improving synthesis quality while keeping the increase in parameter count small.
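The trilinear interpolation step and the PSNR metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: this record does not give the paper's exact formulation, so the code assumes the sampling convention commonly described for deep voxel flow, where each output pixel carries a flow `(dx, dy, dt)`, the earlier frame is sampled at `(x - dx, y - dy)`, the later frame at `(x + dx, y + dy)`, and the two samples are blended with weight `dt`. The function names `bilinear`, `synthesize`, and `psnr` are hypothetical, frames are plain grayscale lists of rows, and the flow is supplied by hand rather than predicted by a network.

```python
import math


def bilinear(frame, x, y):
    """Sample a grayscale frame (list of rows) at real-valued (x, y).

    Coordinates are clamped to the image border (an assumption of this sketch).
    """
    h, w = len(frame), len(frame[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * frame[y0][x0] + fx * frame[y0][x1]
    bottom = (1 - fx) * frame[y1][x0] + fx * frame[y1][x1]
    return (1 - fy) * top + fy * bottom


def synthesize(frame0, frame1, flow):
    """Blend two frames into an intermediate frame.

    flow[y][x] = (dx, dy, dt): spatial offset plus temporal blend weight.
    Bilinear sampling in each frame followed by linear blending in time
    amounts to trilinear interpolation over the two-frame volume.
    """
    h, w = len(frame0), len(frame0[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy, dt = flow[y][x]
            earlier = bilinear(frame0, x - dx, y - dy)
            later = bilinear(frame1, x + dx, y + dy)
            row.append((1 - dt) * earlier + dt * later)
        out.append(row)
    return out


def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    n = len(pred) * len(pred[0])
    mse = sum(
        (p - t) ** 2
        for prow, trow in zip(pred, target)
        for p, t in zip(prow, trow)
    ) / n
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)
```

With zero spatial offset and `dt = 0.5`, `synthesize` reduces to a per-pixel average of the two frames. In the model described above, a refinement network would supply a more accurate flow field to the same kind of sampling step.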
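Keywords: video interpolation; pre-trained model; parameter compression; convolutional neural network; refined deep voxel flow model
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]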