Dual-mode attention spatio-temporal graph convolutional networks for skeleton-based action recognition

Original title: 基于骨架的双模注意力时空图卷积网络人体动作识别方法

Cited by: 1


Authors: SHI Xiang-bin (石祥滨); LIU Hong-zhe (刘宏哲) (College of Computer Science, Shenyang Aerospace University, Shenyang 110136, China)

Affiliation: [1] College of Computer Science, Shenyang Aerospace University, Shenyang 110136, China

Source: Journal of Shenyang Aerospace University (《沈阳航空航天大学学报》), 2023, No. 1, pp. 58-66 (9 pages)

Funding: National Natural Science Foundation of China (Grant No. 61170185).

Abstract: Human action recognition has become an active research area, and skeleton-based methods have attracted wide attention because a skeleton represents human motion explicitly. Manually designed human topology graphs cannot capture global information and carry a large amount of redundant information during feature extraction. To address these problems, a dual-mode attention spatio-temporal graph convolutional network is proposed that makes full use of the node information that plays a key role in action recognition. First, the SGSAE module is proposed: it uses a self-attention mechanism to model the relationships among all joints, extracts global features of the node information, and optimizes the graph topology during training, finally yielding a graph topology that adapts to varied data samples. Second, the global and local features of the nodes are fused with different weights. Finally, a channel attention mechanism is introduced into the network, and the MCA module is proposed to fuse channel features, which reduces redundant information and improves recognition accuracy. Experimental results show that the proposed network achieves good action recognition performance on the NTU-RGB+D and Kinetics datasets.
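The three steps named in the abstract (self-attention over all joints, weighted fusion of global and local features, and channel attention) can be illustrated with a small NumPy sketch. This is not the authors' SGSAE or MCA module, whose internals this record does not describe. It only shows the generic techniques under assumed choices: scaled dot-product attention, a fixed scalar fusion weight `alpha`, a squeeze-and-excitation style channel gate, a single frame of features, and hypothetical function names and shapes throughout.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def normalize_adjacency(adj):
    # Add self-loops, then row-normalize the hand-designed skeleton graph.
    a = adj + np.eye(adj.shape[0])
    return a / a.sum(axis=1, keepdims=True)

def attention_adjacency(x, w_q, w_k):
    # Data-dependent joint-to-joint graph (assumed form: scaled dot-product).
    # x: (V, C) joint features; returns (V, V) with rows summing to 1.
    q, k = x @ w_q, x @ w_k
    return softmax(q @ k.T / np.sqrt(q.shape[-1]), axis=-1)

def fused_graph_conv(x, adj, w_q, w_k, w_out, alpha=0.5):
    # Local branch: aggregate over physical neighbours only.
    local = normalize_adjacency(adj) @ x
    # Global branch: aggregate over all joints via attention.
    global_ = attention_adjacency(x, w_q, w_k) @ x
    # Weighted fusion; alpha is an assumed fixed weight, not a value from the paper.
    fused = alpha * global_ + (1.0 - alpha) * local
    return fused @ w_out

def channel_attention(x, w1, w2):
    # Assumed SE-style gate: pool over joints, bottleneck MLP, sigmoid, rescale.
    squeezed = x.mean(axis=0)                      # (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)        # ReLU
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2)))    # (C,), each value in [0, 1]
    return x * gate, gate

# Toy example: 5 joints in a chain, 8 channels, random weights.
rng = np.random.default_rng(0)
V, C, D = 5, 8, 4
adj = np.zeros((V, V))
for i in range(V - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0

x = rng.standard_normal((V, C))
w_q = 0.1 * rng.standard_normal((C, D))
w_k = 0.1 * rng.standard_normal((C, D))
w_out = 0.1 * rng.standard_normal((C, C))
w1 = 0.1 * rng.standard_normal((C, C // 2))
w2 = 0.1 * rng.standard_normal((C // 2, C))

att = attention_adjacency(x, w_q, w_k)
y = fused_graph_conv(x, adj, w_q, w_k, w_out, alpha=0.5)
z, gate = channel_attention(y, w1, w2)
print(att.shape, y.shape, z.shape)
```

In a trained network the projection matrices and the fusion weight would be learned, and the operation would be applied across frames and combined with temporal convolution; the sketch omits those parts.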

Keywords: action recognition; graph convolutional network; self-attention mechanism; channel attention mechanism; feature fusion

Classification code: TP399 [Automation and Computer Technology: Computer Application Technology]

 
