Spatial temporal convolutional Transformer network for skeleton-based action recognition (cited by: 1)

Authors: Liu Binbin; Zhao Hongtao; Wang Tian; Yang Yi [2]

Affiliations: [1] Zhengzhou Hengda Intelligent Control Technology Company Limited, Zhengzhou 450000, China; [2] School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo 454003, China; [3] Research Institute for Artificial Intelligence, Beihang University, Beijing 100191, China

Source: Electronic Measurement Technology, 2024, No. 1, pp. 169-177 (9 pages)

Funding: Supported by the National Natural Science Foundation of China (61972016).

Abstract: Skeleton-based action recognition methods built on graph convolution rely heavily on hand-designed graph topologies when modeling joint features, and they lack the ability to model global dependencies between joints. To address this, a spatio-temporal convolutional Transformer is designed to model spatial and temporal joint features. For spatial joint feature modeling, a dynamic grouping decoupling Transformer is proposed: it splits the input skeleton sequence into groups along the channel dimension and dynamically generates a different attention matrix for each group, which allows global spatial dependencies between joints to be modeled without prior knowledge of the human body topology. For temporal joint feature modeling, multi-scale temporal convolution extracts action features at different temporal scales. Finally, a spatio-temporal-channel joint attention module is proposed to further refine the extracted spatio-temporal features. The method achieves Top-1 recognition accuracies of 92.5% and 89.3% under the cross-subject evaluation protocol on the NTU-RGB+D and NTU-RGB+D 120 datasets, respectively, which demonstrates its effectiveness.
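The abstract gives no equations, so the following is only a rough sketch of what channel-grouped attention over joints could look like, not the authors' implementation. It assumes that "dynamically generated" means each attention matrix is computed from the input via scaled dot-product attention. The function name `grouped_joint_attention`, the projection matrices `w_q` and `w_k`, and all sizes except the 25 joints of NTU-RGB+D are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def grouped_joint_attention(x, w_q, w_k, num_groups):
    """Per-group self-attention over joints (illustrative assumption).

    x:        (T, V, C) features for T frames, V joints, C channels.
    w_q, w_k: (C, C) hypothetical query/key projection matrices.
    Returns out with shape (T, V, C) and attn with shape (T, G, V, V).
    """
    T, V, C = x.shape
    assert C % num_groups == 0, "channels must split evenly into groups"
    d = C // num_groups

    def split(z):
        # (T, V, C) -> (T, G, V, d): one channel slice per group.
        return z.reshape(T, V, num_groups, d).transpose(0, 2, 1, 3)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x)
    # One V x V attention matrix per frame and per channel group,
    # computed from the data rather than from a fixed skeleton graph.
    attn = softmax(q @ k.transpose(0, 1, 3, 2) / np.sqrt(d), axis=-1)
    out = attn @ v  # (T, G, V, d)
    # Merge the groups back into the channel dimension.
    out = out.transpose(0, 2, 1, 3).reshape(T, V, C)
    return out, attn

rng = np.random.default_rng(0)
T, V, C, G = 4, 25, 16, 4  # 25 joints as in NTU-RGB+D; other sizes arbitrary
x = rng.standard_normal((T, V, C))
w_q = rng.standard_normal((C, C)) * 0.1
w_k = rng.standard_normal((C, C)) * 0.1
out, attn = grouped_joint_attention(x, w_q, w_k, G)
print(out.shape, attn.shape)
```

Because every joint attends to every other joint, no adjacency matrix is needed, and each channel group is free to learn a different joint-to-joint relation. That is the property the abstract contrasts with hand-designed graph topologies.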

Keywords: action recognition; human skeleton; self-attention mechanism; Transformer

Classification: TP391.4 [Automation and Computer Technology - Computer Application Technology]
