Author: MA Lei (School of Information Engineering, Tongren Polytechnic College, Tongren, Guizhou 554300, China)
Affiliation: [1] School of Information Engineering, Tongren Polytechnic College, Tongren, Guizhou 554300, China
Source: Journal of Sichuan Vocational and Technical College, 2025, No. 2, pp. 164-168 (5 pages)
Abstract: Intelligent security, virtual reality games, and robot interaction are important application scenarios for human motion and behavior recognition. In recent years, multimodal deep neural network models based on RGB-D video, temporal human-skeleton information, and multi-sensor data fusion have shown excellent performance on human action recognition tasks. Exploiting the temporal variation of skeleton information, this paper proposes a human action recognition model based on spatiotemporal graph convolution and multimodal fusion. The model first uses prior knowledge to integrate sensor data into the features of the skeletal nodes, then applies graph convolution to process the skeleton features and 3D convolution to process the RGB-D video sequences. After the multimodal features are fused, the output layer applies softmax to obtain the final action prediction. Experimental results on the UTD-MHAD multimodal dataset show that the proposed model improves recognition accuracy by an average of 4.43% over the baseline methods.
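The pipeline sketched in the abstract can be illustrated as follows: graph convolution over skeleton-joint features, a second feature branch standing in for the 3D-convolution RGB-D stream, concatenation-based fusion, and a softmax output. This is a minimal numpy sketch under stated assumptions; the toy chain skeleton, feature dimensions, and random inputs are illustrative and do not reflect the paper's actual architecture or parameters (UTD-MHAD does contain 27 action classes).

```python
import numpy as np

def graph_conv(X, A, W):
    """One graph-convolution layer: ReLU(D^{-1/2} (A+I) D^{-1/2} X W)."""
    A_hat = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
num_joints, feat_dim, hidden, num_classes = 5, 8, 16, 27  # 27 = UTD-MHAD actions

# Toy 5-joint chain skeleton (illustrative, not the real body graph).
A = np.zeros((num_joints, num_joints))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

# Joint features (imagined as already augmented with fused sensor data).
X = rng.standard_normal((num_joints, feat_dim))
W = rng.standard_normal((feat_dim, hidden))

skel_feat = graph_conv(X, A, W).mean(axis=0)   # pooled skeleton branch
rgbd_feat = rng.standard_normal(hidden)        # stand-in for the 3D-conv RGB-D branch

fused = np.concatenate([skel_feat, rgbd_feat]) # late feature fusion
W_out = rng.standard_normal((2 * hidden, num_classes))
probs = softmax(fused @ W_out)                 # class distribution over 27 actions
pred = int(np.argmax(probs))
```

The normalized adjacency with self-loops is the standard GCN propagation rule; mean-pooling over joints and simple concatenation are the simplest possible fusion choices and merely stand in for whatever fusion scheme the paper actually uses.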
Keywords: human action recognition; multimodal fusion; graph neural networks; sensor applications
Classification: TP391.4 [Automation and Computer Technology / Computer Application Technology]