Authors: 李雯 (LI Wen); 程旭 (CHENG Xue); 胡晓宇 (HU Xiao-yu) (School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China; Nanjing Institute of Intelligent Technology, Nanjing 210000, China)
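Affiliations: [1] School of Computer Science, Nanjing University of Information Science and Technology, Nanjing, Jiangsu 210044 [2] Nanjing Institute of Intelligent Technology, Nanjing, Jiangsu 210000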
Source: Journal of China Academy of Electronics and Information Technology (《中国电子科学研究院学报》), 2025, No. 1, pp. 83-91 (9 pages)
Abstract: With the advent of depth sensors such as Kinect, research on skeleton data containing the 3D coordinates of hand joints has become increasingly widespread. Many studies build skeleton-based gesture recognition systems by computing the dependencies between joints. However, because these methods are limited in capturing the complex spatial relationships and temporal dependencies between joints, the features they extract are often inefficient, and they may struggle to achieve high performance and generality. This paper proposes an improved gesture recognition model that combines a multi-head attention mechanism with multi-scale dilated convolution to better capture the complex dependencies between hand joints, significantly improving recognition performance and generalization. The multi-head attention mechanism encodes the spatial relationships between joints by constructing a fully connected graph, which captures the complex dependencies between them in fine detail. In addition, parallel multi-scale convolutional layers with different dilation rates are introduced to capture temporal information at multiple scales, improving the model's sensitivity to dynamic changes. Furthermore, to strengthen robustness in dynamic gesture recognition, an attention-weighted cross-entropy loss function is adopted. Experiments on the DHG dynamic gesture dataset show that the model reaches a classification accuracy of 97.8% on the 14-class gesture task, an average improvement of 7% over other algorithms, and 92.1% on the 28-class gesture task, an average improvement of 5% over other algorithms.
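The abstract's "parallel multi-scale convolutional layers with different dilation rates" can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the 3-tap kernel, and the dilation rates (1, 2, 4) are all assumptions chosen for illustration, and a real model would use learned kernels over joint feature channels.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Length-preserving 1D dilated convolution (cross-correlation) along time."""
    k = len(kernel)
    span = (k - 1) * dilation           # distance between the first and last tap
    pad = span // 2                     # zero padding on the left
    padded = [0.0] * pad + list(signal) + [0.0] * (span - pad)
    return [
        sum(kernel[j] * padded[t + j * dilation] for j in range(k))
        for t in range(len(signal))
    ]


def multi_scale_branches(signal, kernel, dilations=(1, 2, 4)):
    """Apply the same kernel at several dilation rates; one output per rate."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


def receptive_field(kernel_size, dilation):
    """Number of time steps covered by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]   # a single event at frame 4
    for rate, out in multi_scale_branches(impulse, [1, 1, 1]).items():
        print(rate, receptive_field(3, rate), out)
```

Each branch sees the same input at a different temporal stride, so a larger dilation rate widens the receptive field without adding kernel weights; the branch outputs would typically be concatenated or summed before the next layer.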
Keywords: dynamic gesture recognition; hand skeleton joints; spatio-temporal attention; multi-scale dilated convolution; deep learning
Classification number: TP399 [Automation and Computer Technology: Computer Application Technology]