Authors: SHI Yuhang (施宇航); HE Qiang (何强); WANG Hengyou (王恒友) (School of Science, Beijing University of Civil Engineering and Architecture, Beijing 102616, China; Institute of Big Data Modeling Theory and Technology, Beijing University of Civil Engineering and Architecture, Beijing 102616, China)
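Affiliations: [1] School of Science, Beijing University of Civil Engineering and Architecture, Beijing 102616 [2] Institute of Big Data Modeling Theory and Technology, Beijing University of Civil Engineering and Architecture, Beijing 102616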
Source: Journal of Shanxi University (Natural Science Edition) (《山西大学学报(自然科学版)》), 2025, No. 1, pp. 55-65 (11 pages)
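Funding: National Natural Science Foundation of China (62072024, 12301581); Scientific Research Program of the Beijing Municipal Education Commission (KM202210016002); Beijing University of Civil Engineering and Architecture Master's Student Innovation Project (09081024002).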
Abstract: Existing graph convolution methods for skeleton-based action recognition suffer from fixed joint partitioning, an emphasis on spatial information at the expense of temporal information, and a large number of network parameters. To address these issues, symmetric-joint information is first introduced to add interaction features for symmetric actions. Second, a multi-scale pyramid (MSP) temporal graph convolution module is added, forming a dual-branch (DB) network structure that improves the network's ability to extract information along the temporal dimension. Finally, feature mapping and spatial aggregation (FM-SA) is used to filter out redundant parts of the weight matrix while preserving the original topological structure information, and a Squeeze-and-Excitation (SE) module is added, which improves spatial feature extraction and the expressive power of the feature maps. Experimental results show that, compared with the baseline model, the number of network parameters is reduced by 51%. On the NTU RGB+D 120 dataset, recognition accuracy for the joint and bone streams increases by 0.5% and 1.3%, and fusion accuracy increases by 0.7% and 0.5%. On the NTU RGB+D and Northwestern-UCLA (NW-UCLA) datasets, recognition accuracy increases by 0.1%, 0.2%, and 1.5%, respectively. These results verify the effectiveness and feasibility of the model.
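For readers unfamiliar with the Squeeze-and-Excitation (SE) module mentioned in the abstract, the following is a minimal NumPy sketch of generic SE channel reweighting on a skeleton feature map. It is not the authors' implementation: the tensor layout (channels, frames, joints), the reduction ratio `r`, the random weights, and the function name `se_block` are all assumptions made for illustration, and bias terms are omitted.

```python
import numpy as np


def se_block(x, w1, w2):
    """Generic squeeze-and-excitation on x with assumed shape (C, T, V).

    w1: (C // r, C) bottleneck weights, w2: (C, C // r) expansion weights.
    Returns the reweighted features and the per-channel gates.
    """
    # Squeeze: average each channel over frames (T) and joints (V).
    z = x.mean(axis=(1, 2))                      # shape (C,)
    # Excitation: bottleneck with ReLU, then expansion with sigmoid.
    h = np.maximum(w1 @ z, 0.0)                  # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ h)))      # shape (C,), values in (0, 1)
    # Scale: multiply every channel by its gate via broadcasting.
    return x * gates[:, None, None], gates


# Illustrative sizes only (hypothetical): 8 channels, 4 frames, 25 joints, r = 4.
rng = np.random.default_rng(0)
C, T, V, r = 8, 4, 25, 4
x = rng.standard_normal((C, T, V))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
y, gates = se_block(x, w1, w2)
```

Each gate lies strictly between 0 and 1, so the block can only attenuate channels relative to one another; in a trained network the two weight matrices would be learned rather than drawn at random.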