A Super-Resolution Method for Spatiotemporal Feature Fusion Based on Deformable Attention

Authors: ZHANG Mohua [1]; ZHANG Yuchao; LIU Ji (School of Computer Information and Engineering, Henan University of Economics and Law, Zhengzhou 450000, China)

Affiliation: [1] School of Computer Information and Engineering, Henan University of Economics and Law, Zhengzhou 450000, Henan, China

Source: Software Guide (《软件导刊》), 2024, No. 12, pp. 234-240 (7 pages)

Funding: Henan Provincial Science and Technology Research Project (222102210326).

Abstract: Video super-resolution aims to convert low-resolution video into high-resolution video. Existing feature alignment methods based on deformable convolution are limited by the size of the receptive field and can only apply local offsets within the convolution window at specified spatial positions, so they perform poorly when there is large-scale motion between frames. To address this, an alignment method based on a deformable-attention spatial transformation is proposed that samples over the entire feature map. First, offsets steer the sampling points toward arbitrary positions relevant to the location currently being processed. Second, the model uses a recurrent structure to propagate fused features at the global scale, and a Transformer to extract features and align frames at the local scale. Third, the aligned features are fed into a spatiotemporal feature fusion module with channel attention to supplement the information available for reconstruction. Finally, the output of the fusion module is propagated bidirectionally through the recurrent network to supplement the temporal features of adjacent frames, and the high-resolution video is obtained by 4x upsampling with sub-pixel convolution. Experiments show that, with BasicVSR as the baseline, the network improves PSNR by 0.69 dB on the REDS4 dataset and by 0.43 dB on the Vid4 dataset.
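For readers unfamiliar with sub-pixel convolution, the upsampling step mentioned in the abstract ends in a channel-to-space rearrangement. The following is a minimal NumPy sketch of that rearrangement only, assuming a channels-first layout and the common convention that each group of r*r input channels fills one r-by-r output block; the function name `pixel_shuffle` and this layout are illustrative assumptions, not details taken from the paper, and the convolution that would produce the input channels is omitted.

```python
import numpy as np


def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r).

    Assumed convention (not from the paper):
    out[c, h*r + i, w*r + j] == x[c*r*r + i*r + j, h, w]
    """
    c_r2, h, w = x.shape
    if c_r2 % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_r2 // (r * r)
    # Split the channel axis into (c, i, j), where i and j index
    # the position inside each r-by-r output block.
    x = x.reshape(c, r, r, h, w)
    # Interleave block offsets with spatial positions: (c, h, i, w, j).
    x = x.transpose(0, 3, 1, 4, 2)
    # Merge (h, i) into output rows and (w, j) into output columns.
    return x.reshape(c, h * r, w * r)
```

With r = 4, a feature map of shape (16, H, W) becomes a single-channel map of shape (1, 4H, 4W), which matches the 4x scale factor reported in the abstract.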

Keywords: recurrent neural network; video super-resolution; Transformer; attention mechanism; deep learning

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
