Feature-Space Optimization-Inspired and Multi-Hypothesis Cross-Attention Reconstruction Neural Network for Video Compressive Sensing


Authors: YANG Chunling (杨春玲) [1], CHEN Wenjun (陈文俊), LIU Jiahui (刘嘉惠)

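Affiliation: [1] School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China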

Source: Journal of South China University of Technology (Natural Science Edition), 2024, No. 10, pp. 9-21 (13 pages)

Funding: Supported by the Natural Science Foundation of Guangdong Province (2019A1515011949).

Abstract: Existing video compressive sensing reconstruction networks usually rely on an optical flow network to perform motion estimation and motion compensation in the pixel domain. During reconstruction, however, the input to the optical flow network is a low-quality initial estimate of each frame, so the estimated optical flow is inaccurate. Pixel-domain alignment and fusion based on this flow accumulate noise, which produces visible artifacts in the reconstructed frames and degrades reconstruction quality. Because multi-channel information in the feature space is robust to interfering noise, this paper applies the idea of feature-space optimization to the design of a video compressive sensing reconstruction network and proposes a feature-space optimization-inspired and flow-guided multi-hypothesis cross-attention network (FOFMCNet). To prevent noise in the optical flow from destroying image structure during warping, the paper designs, in the feature space, a flow-guided multi-hypothesis motion estimation module and a cross-attention-based motion compensation module. Together they perform inter-frame motion estimation and motion compensation in the feature space, making fuller use of inter-frame correlation to assist the reconstruction of non-key frames. To strengthen the reuse of useful information during feature optimization, improve the learning ability of the network, and alleviate the gradient explosion problem, the paper designs a feature-space optimization-inspired U-shaped network (FOUNet) as a sub-network of FOFMCNet. By cascading multiple FOUNets, FOFMCNet optimizes and reconstructs non-key frames in the feature space. Experimental results show that the proposed algorithm outperforms existing video compressive sensing algorithms on classical low-resolution datasets (UCF-101 and QCIF) and on a new high-resolution dataset (REDS4).
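The idea of fusing several motion hypotheses with cross-attention can be illustrated with a minimal sketch. This is not the FOFMCNet module itself, whose layers are not specified in this record. It is a plain scaled dot-product attention for one spatial position, with no learned projections, and the names `cross_attention_fuse`, `query`, and `hypotheses` are hypothetical. The query stands for a feature vector of the current non-key frame, and the hypotheses stand for candidate feature vectors taken from a reference frame.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention_fuse(query, hypotheses):
    """Fuse K hypothesis feature vectors for one spatial position.

    query:      feature vector of the current (non-key) frame, length C
    hypotheses: K candidate feature vectors from a reference frame, each of
                length C (assumed here to come from a motion-estimation step)
    Returns (fused_vector, attention_weights).
    """
    c = len(query)
    scale = 1.0 / math.sqrt(c)  # standard scaled dot-product factor
    # Similarity between the current-frame feature and each hypothesis.
    scores = [scale * sum(q * h for q, h in zip(query, hyp)) for hyp in hypotheses]
    weights = softmax(scores)
    # Weighted sum of the hypotheses, channel by channel.
    fused = [sum(w * hyp[i] for w, hyp in zip(weights, hypotheses)) for i in range(c)]
    return fused, weights


# Illustrative toy data (made up for this sketch, not from the paper).
query = [1.0, 0.0, 1.0, 0.0]
hypotheses = [
    [1.0, 0.0, 1.0, 0.0],  # well aligned with the query
    [0.0, 1.0, 0.0, 1.0],  # orthogonal to the query
    [0.5, 0.5, 0.5, 0.5],  # partially aligned
]
fused, weights = cross_attention_fuse(query, hypotheses)
```

In this toy setting the best-aligned hypothesis receives the largest weight, so a poorly matched hypothesis contributes less to the fused feature than it would under a hard warp with a single, possibly noisy, flow vector.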

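Keywords: video compressive sensing; feature-space optimization; convolutional neural network; attention mechanism; motion estimation and compensation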

Classification No.: TN919.81 [Electronics and Telecommunications: Communication and Information Systems]

 
