Reconstruction of Video Snapshot Compressive Imaging Based on Triple Self-Attention

Cited by: 1

Authors: ZHOU Yu (周宇) [1]; XIE Wei (谢威) [1]; Kwong Tak Wu (邝得互) [2]; JIANG Jianmin (江健民) [1]

Affiliations: [1] College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518000, Guangdong, China; [2] Department of Computing and Decision Sciences, Lingnan University, Hong Kong 999077, China

Source: Computer Engineering (《计算机工程》), 2025, No. 1, pp. 20-30 (11 pages)

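Funding: Key Program of the National Natural Science Foundation of China (62032015); General Program for Basic Research of the Shenzhen Science and Technology Innovation Commission (JCYJ20220810112354002).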

Abstract: Video snapshot compressive imaging (SCI) is a computational imaging technique that achieves efficient imaging through hybrid compression in the temporal and spatial domains. In video SCI, the sparsity of the signal and its temporal and spatial correlations can be exploited by a suitable reconstruction algorithm to recover the original video signal. Although deep learning-based reconstruction algorithms have achieved good results in many tasks, they still suffer from excessive model complexity and slow reconstruction. To address these issues, this paper proposes SCT-SCI, a video SCI reconstruction network based on triple self-attention, which uses a multi-branch grouped self-attention mechanism to exploit correlations in the temporal and spatial domains. The SCT-SCI model consists of a feature extraction module, a video reconstruction module, and multiple triple self-attention modules, called SCT-Blocks. Each SCT-Block consists of a window self-attention branch, a channel self-attention branch, and a temporal self-attention branch. In addition, a spatial fusion module, SC-2DFusion, and a global fusion module, SCT-3DFusion, are introduced to strengthen feature fusion. Experimental results on simulated video datasets show that the model has the advantage of low complexity: it reduces reconstruction time by 31.58% compared with the EfficientSCI model while maintaining comparable reconstruction quality, thereby improving real-time performance.
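The "hybrid compression in the temporal and spatial domains" mentioned in the abstract is commonly modeled as modulating each frame with its own mask and summing the results into a single 2D snapshot. The following is a minimal NumPy sketch of that standard measurement model, not the paper's SCT-SCI network. The function names `sci_measure` and `normalized_estimate`, the random binary masks, and the array sizes are all assumptions made for illustration.

```python
import numpy as np


def sci_measure(video, masks):
    """Collapse T frames into one 2D snapshot: Y = sum_t masks[t] * video[t]."""
    if video.shape != masks.shape:
        raise ValueError("video and masks must share shape (T, H, W)")
    return (video * masks).sum(axis=0)


def normalized_estimate(measurement, masks, eps=1e-8):
    """Crude per-pixel starting guess: divide the snapshot by the mask sum."""
    return measurement / (masks.sum(axis=0) + eps)


rng = np.random.default_rng(0)
T, H, W = 8, 16, 16                     # hypothetical clip size
video = rng.random((T, H, W))           # stand-in for a real video clip
# Assumed random binary modulation masks, one per frame.
masks = rng.integers(0, 2, size=(T, H, W)).astype(np.float64)

snapshot = sci_measure(video, masks)
print(snapshot.shape)  # (16, 16)
```

A reconstruction network such as the one described in the abstract would take `snapshot` and `masks` as input and estimate all `T` frames. The normalized estimate here is only a simple starting point of the kind often fed to such networks.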

Keywords: snapshot compressive imaging; compressive sensing; Transformer architecture; deep learning; feature fusion

Classification: TP399 [Automation and Computer Technology: Computer Application Technology]
