Spatiotemporally Enhanced Dual-branch Graph Convolutional Network for Skeleton-based Action Recognition

Authors: SHI Yuhang (施宇航); HE Qiang (何强); WANG Hengyou (王恒友) (School of Science, Beijing University of Civil Engineering and Architecture, Beijing 102616, China; Institute of Big Data Modeling Theory and Technology, Beijing University of Civil Engineering and Architecture, Beijing 102616, China)

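Affiliations: [1] School of Science, Beijing University of Civil Engineering and Architecture, Beijing 102616 [2] Institute of Big Data Modeling Theory and Technology, Beijing University of Civil Engineering and Architecture, Beijing 102616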

Source: Journal of Shanxi University (Natural Science Edition), 2025, No. 1, pp. 55-65 (11 pages)

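Funding: National Natural Science Foundation of China (62072024, 12301581); Scientific Research Program of the Beijing Municipal Education Commission (KM202210016002); Beijing University of Civil Engineering and Architecture Master's Student Innovation Project (09081024002).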

Abstract: Existing graph convolution methods for skeleton-based action recognition suffer from fixed joint partitioning, an emphasis on spatial information at the expense of temporal information, and a large number of network parameters. To address these problems, this study first introduces symmetric-joint information to add interaction features for symmetric actions. Second, a multi-scale pyramid (MSP) temporal graph convolution module is added, forming a dual-branch (DB) network structure that improves the network's ability to extract information along the temporal dimension. Finally, feature mapping and spatial aggregation (FM-SA) is used to filter out redundant parts of the weight matrix while preserving the original topological structure information, and a Squeeze-and-Excitation (SE) module is added, which together improve spatial feature extraction and the expressive power of the feature maps. Experimental results show that, compared with the baseline model, the number of network parameters is reduced by 51%. On the NTU RGB+D 120 dataset, the recognition accuracy of the joint and bone streams increases by 0.5% and 1.3%, respectively, and the fusion accuracy increases by 0.7% and 0.5%. On the NTU RGB+D and Northwestern-UCLA (NW-UCLA) datasets, recognition accuracy increases by 0.1%, 0.2%, and 1.5%. These results verify the effectiveness and feasibility of the proposed model.
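As an illustration of the Squeeze-and-Excitation idea named in the abstract, the following is a minimal NumPy sketch of a generic SE block applied to a skeleton feature map. It is not the authors' implementation: the function name `se_block`, the tensor layout (channels, frames, joints), the reduction ratio `r`, and the random weights are all assumptions made for this example.

```python
import numpy as np

def se_block(x, w1, w2):
    """Generic Squeeze-and-Excitation over a skeleton feature map (illustrative).

    x  : (C, T, V) features (channels, frames, joints) -- assumed layout
    w1 : (C // r, C) weights of the bottleneck layer
    w2 : (C, C // r) weights of the expansion layer
    """
    # Squeeze: average over frames and joints, one descriptor per channel
    s = x.mean(axis=(1, 2))                  # shape (C,)
    # Excitation: bottleneck with ReLU, then a sigmoid gate in (0, 1)
    h = np.maximum(w1 @ s, 0.0)              # shape (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ h)))   # shape (C,)
    # Rescale every channel of the feature map by its gate value
    return x * gate[:, None, None]

rng = np.random.default_rng(0)
C, T, V, r = 64, 30, 25, 4   # hypothetical sizes; NTU RGB+D skeletons have 25 joints
x = rng.standard_normal((C, T, V))
w1 = rng.standard_normal((C // r, C)) * 0.1   # random stand-ins for learned weights
w2 = rng.standard_normal((C, C // r)) * 0.1
y = se_block(x, w1, w2)
print(y.shape)
```

The output keeps the shape of the input; only the per-channel scale changes, which is why such a block can be attached to an existing layer without altering the rest of the network.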

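Keywords: skeleton-based action recognition; joint partitioning; spatiotemporal information enhancement; multi-scale pyramid; mapping and aggregation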

Classification number: O436 [Mechanical Engineering: Optical Engineering]

 
