Fusion of Spatio-Temporal Domain Knowledge and Data-Driven for Skeleton-Based Action Recognition


Authors: LIANG Chengwu; HU Wei; YANG Jie; JIANG Songqi; HOU Ning

Affiliations: [1] College of Electrical Engineering and New Energy, China Three Gorges University, Yichang, Hubei 443002, China; [2] School of Electrical and Control Engineering, Henan University of Urban Construction, Pingdingshan, Henan 467036, China

Source: Computer Engineering and Applications (《计算机工程与应用》), 2025, No. 5, pp. 165-176 (12 pages)

Funding: National Natural Science Foundation of China (62176086, U1804152).

Abstract: Skeleton-based action recognition has attracted growing research attention because skeleton data are compact and robust to background interference. Existing data-driven methods, however, have not fully explored how to incorporate spatio-temporal domain knowledge about skeleton actions. To address this, the paper proposes a skeleton action recognition method that fuses spatio-temporal prior knowledge of human actions with an improved CNN architecture. First, drawing on domain knowledge about key spatio-temporal features, a temporal-channel focusing module is proposed; it generates an aggregation coefficient matrix that guides the model toward discriminative feature representations. Second, drawing on domain knowledge about long spatio-temporal spans, a multi-scale convolutional fusion module is proposed; it uses grouped residual connections along the channel dimension to flexibly enlarge the temporal receptive field of the convolution, gaining the ability to represent long-span spatio-temporal features without introducing a large number of parameters. The method is evaluated on three large-scale datasets, NTU RGB+D, NTU RGB+D 120, and FineGYM, where it achieves recognition accuracies of 96.6%, 89.6%, and 94.1%, respectively. The results indicate that fusing spatio-temporal domain knowledge with data-driven learning exploits the spatio-temporal features of skeleton actions more fully, improves recognition performance, and generalizes across datasets.
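The abstract does not give the internal layout of the multi-scale convolutional fusion module, so the following is only a minimal sketch of why grouped residual connections along the channel dimension can widen the temporal receptive field cheaply. It assumes a Res2Net-style hierarchy: channels are split into groups, the first group passes through unchanged, and each later group is added to the previous group's output before its own temporal convolution. All function names, the all-ones kernel, the one-channel-per-group simplification, and the example sizes (64 channels, 4 groups, kernel size 3) are illustrative assumptions, not details from the paper.

```python
def temporal_conv(signal, kernel_size):
    """Same-padded 1D convolution along time with an all-ones kernel
    (a stand-in for learned weights; assumed for illustration only)."""
    half = kernel_size // 2
    n = len(signal)
    return [
        sum(signal[j] for j in range(max(0, t - half), min(n, t + half + 1)))
        for t in range(n)
    ]


def grouped_residual_temporal(groups, kernel_size):
    """Assumed Res2Net-style hierarchy over channel groups.

    groups: list of time series, one (single-channel) series per group.
    Group 1 is passed through; group i >= 2 is convolved after adding
    the output of group i - 1.
    """
    outputs = [list(groups[0])]
    prev = None
    for x in groups[1:]:
        inp = list(x) if prev is None else [a + b for a, b in zip(x, prev)]
        prev = temporal_conv(inp, kernel_size)
        outputs.append(prev)
    return outputs


def support_width(signal):
    """Number of time steps spanned by the non-zero part of a signal."""
    nonzero = [t for t, v in enumerate(signal) if v != 0]
    return nonzero[-1] - nonzero[0] + 1 if nonzero else 0


def receptive_fields(num_groups, kernel_size, length=33):
    """Measure each group's temporal receptive field with a unit impulse."""
    impulse = [0.0] * length
    impulse[length // 2] = 1.0
    outs = grouped_residual_temporal([impulse] * num_groups, kernel_size)
    return [support_width(y) for y in outs]


def conv_weights(channels_in, channels_out, kernel_size):
    """Weight count of a dense temporal convolution (bias ignored)."""
    return channels_in * channels_out * kernel_size


def grouped_weights(channels, num_groups, kernel_size):
    """Weight count of the grouped hierarchy: one small conv per group
    except the first, each acting on channels // num_groups channels."""
    width = channels // num_groups
    return (num_groups - 1) * conv_weights(width, width, kernel_size)


if __name__ == "__main__":
    print("receptive field per group:", receptive_fields(4, 3))
    print("grouped weights (k=3):", grouped_weights(64, 4, 3))
    print("dense conv weights for the same span (k=7):", conv_weights(64, 64, 7))
```

Under these assumptions, group i sees 1 + (i - 1)(k - 1) time steps, so four groups with kernel size 3 cover spans of 1, 3, 5, and 7 frames in a single block, while using far fewer weights than one dense convolution with kernel size 7 over all channels.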

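Keywords: spatio-temporal domain knowledge; data-driven; skeleton-based action recognition; convolutional neural network; long-range spatio-temporal modeling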

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
