Video Human Action Recognition Based on Hybrid Attention Mechanism

Authors: ZHU Lian-xiang (朱联祥); NIU Wen-yu (牛文煜); TONG Wen-dong (仝文东); SHAO Hao-jie (邵浩杰) (School of Computer Science, Xi'an Shiyou University, Xi'an 710065, China)

Affiliation: [1] School of Computer Science, Xi'an Shiyou University, Xi'an, Shaanxi 710065, China

Source: Computer Technology and Development (《计算机技术与发展》), 2023, No. 9, pp. 105-112 (8 pages)

Funding: Open Research Project of the Engineering Research Center of Mobile Communications, Ministry of Education (cqupt-mct-202006).

Abstract: C3D, a typical three-dimensional convolutional neural network, is widely used for video action recognition. To address its insufficient feature extraction, tendency to overfit, and low recognition accuracy, a C3D model that incorporates a hybrid attention mechanism is proposed. A hybrid attention module, built from a GCNet channel attention module and a 3D-Crisscross spatial attention module, is inserted into the original C3D network. Both attention modules perform global context modeling and can establish long-range dependencies over 3D features, which strengthens the network's feature extraction along the channel and spatial dimensions and improves classification performance. The proposed method is evaluated on two large video datasets, UCF-101 and HMDB-51, and compared with other deep learning models. The results show that it achieves higher accuracy than the compared models, reaching recognition accuracies of 96.7% on UCF-101 and 63.3% on HMDB-51, a clear improvement over the original C3D method.
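The global context modeling mentioned in the abstract can be illustrated with a minimal NumPy sketch of a GCNet-style block applied to a 3D feature map. This is an assumption-laden illustration, not the authors' implementation: the layer layout follows the general GCNet design (softmax-pooled context, bottleneck transform, broadcast addition), and the function name `global_context_block`, the weight names `w_k`, `w_1`, `w_2`, and the tensor sizes are all hypothetical. The 3D-Crisscross spatial branch is not shown.

```python
import numpy as np

def global_context_block(x, w_k, w_1, w_2, eps=1e-5):
    """GCNet-style global context attention for one clip (illustrative only).

    x   : (C, T, H, W) feature map
    w_k : (C,)         weights of a 1x1x1 conv that scores each position
    w_1 : (C//r, C)    bottleneck reduction weights (hypothetical name)
    w_2 : (C, C//r)    bottleneck expansion weights (hypothetical name)
    """
    c = x.shape[0]
    flat = x.reshape(c, -1)                  # (C, N) with N = T*H*W

    # Context modeling: one softmax over all space-time positions.
    scores = w_k @ flat                      # (N,)
    scores = scores - scores.max()           # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum()
    context = flat @ attn                    # (C,) attention-pooled features

    # Transform: bottleneck with layer normalization and ReLU.
    hidden = w_1 @ context                   # (C//r,)
    hidden = (hidden - hidden.mean()) / np.sqrt(hidden.var() + eps)
    hidden = np.maximum(hidden, 0.0)
    delta = w_2 @ hidden                     # (C,) per-channel correction

    # Fusion: add the same channel-wise correction at every position.
    return x + delta[:, None, None, None]

# Example with assumed sizes: 8 channels, 4 frames, 6x6 spatial grid, r = 4.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 6, 6))
w_k = rng.standard_normal(8)
w_1 = rng.standard_normal((2, 8))
w_2 = rng.standard_normal((8, 2))
out = global_context_block(x, w_k, w_1, w_2)
```

Because the context vector is pooled over every space-time position, each output location receives information from the whole clip, which is the long-range dependency the abstract refers to. The output keeps the input shape, so a block of this kind can in principle be placed between existing convolutional stages.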

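Keywords: human action recognition; three-dimensional convolutional neural network; global context modeling; long-range dependency; attention mechanism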

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]
