Cross-dimensional interactive attention for skeleton graph convolutional action recognition


Authors: HAN Shoudong[1], GONG Yuzhou, XIE Yunfei, LI Hongquan

Affiliations: [1] School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China; [2] Shandong Electric Group New Energy Technology Co. Ltd, Jinan 250000, Shandong, China

Source: Journal of Huazhong University of Science and Technology (Natural Science Edition), 2024, No. 11, pp. 93-100 (8 pages)

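Funding: Jinan Science and Technology Plan Project (202214002); Fund of the National Key Laboratory of Multispectral Information Intelligent Processing Technology (6142113220208).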

Abstract: To address the failure of traditional graph convolutional networks to fully exploit the interaction semantics of skeleton features across different dimensions in skeleton-based action recognition, a cross-dimensional interactive attention (CDIA) is proposed. CDIA comprises three sub-attentions: spatial-channel grouped attention (S-CGA) correlates local and global interaction features among the nodes within each skeleton subgraph and between subgraphs; temporal-spatial displacement attention (T-SSA) establishes the contextual dependence of first-order pose features between frames; and temporal-channel differential attention (T-CDA) enhances the representation of second-order dynamic features between frames. Experimental results show that, relative to the baseline network, CDIA improves recognition accuracy by 3.2% and 1.1% on the X-Sub and X-View benchmarks of NTU 60, by 0.9% and 1.8% on the X-Sub and X-Set benchmarks of NTU 120, and by 3.3% on the FineGYM dataset, with only a slight increase in computational cost and parameter count. CDIA can be integrated into different graph convolutional networks; it is lightweight and plug-and-play, and delivers excellent recognition performance.
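The first-order (pose displacement) and second-order (dynamic) inter-frame features mentioned in the abstract can be illustrated with plain finite differences along the time axis. What follows is a minimal NumPy sketch under assumptions that do not come from the paper: the `(T, V, C)` tensor layout, the function names `frame_differences` and `motion_gate`, and the parameter-free sigmoid gate are all hypothetical. It is not the authors' T-SSA or T-CDA module, only an illustration of the kind of signal such modules could draw on.

```python
import numpy as np


def frame_differences(x: np.ndarray):
    """Finite differences over time for a skeleton sequence.

    x has shape (T, V, C): frames, joints, coordinate channels
    (an assumed layout, chosen only for this illustration).
    """
    # First-order difference: x[t+1] - x[t], shape (T-1, V, C).
    first_order = np.diff(x, n=1, axis=0)
    # Second-order difference: x[t+2] - 2*x[t+1] + x[t], shape (T-2, V, C).
    second_order = np.diff(x, n=2, axis=0)
    return first_order, second_order


def motion_gate(x: np.ndarray):
    """Hypothetical parameter-free gate that reweights frames by their
    second-order motion energy. Requires at least 3 frames."""
    _, second_order = frame_differences(x)
    # Mean absolute second-order difference per frame, shape (T-2,).
    energy = np.abs(second_order).mean(axis=(1, 2))
    # Repeat the edge values so there is one weight per original frame.
    energy = np.pad(energy, (1, 1), mode="edge")
    # Sigmoid of the mean-centred energy gives weights in (0, 1).
    weights = 1.0 / (1.0 + np.exp(-(energy - energy.mean())))
    # Residual reweighting keeps the original features and adds emphasis.
    return x * (1.0 + weights[:, None, None]), weights


# Usage: a toy sequence whose coordinates follow t**2 (constant acceleration).
T, V, C = 5, 2, 3
x = (np.arange(T, dtype=float) ** 2)[:, None, None] * np.ones((T, V, C))
first_order, second_order = frame_differences(x)
gated, weights = motion_gate(x)
```

In this toy sequence every frame has the same second-order energy, so the gate weights all frames equally. A sequence with a sudden change in motion would yield larger weights near that change.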

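Keywords: skeleton-based action recognition; graph convolutional network; cross-dimensional interactive attention; local and global; pose feature association; inter-frame difference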

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
