Gesture recognition method with separable long short-term attention networks

Cited by: 3

Authors: GU Ming; LI Yiqun; ZHANG Erchao; ZHANG Xunlei; QI Lin [3]; TIE Yun

Affiliations: [1] Henan Communications Investment Group Company Limited, Zhengzhou Henan 450016, China; [2] Zhengzhou Branch, Zhongxun Post & Telecommunication Consulting & Design Institute Company Limited, Zhengzhou Henan 450000, China; [3] School of Information Engineering, Zhengzhou University, Zhengzhou Henan 450001, China

Source: Journal of Computer Applications (《计算机应用》), 2022, Issue S01, pp. 59-63 (5 pages)

Abstract: In human-computer interaction, most gesture recognition algorithms cannot effectively eliminate the influence of the acquisition background on the gesture region to be extracted, and accurately modeling gesture motion information is also difficult. To address these problems, a method based on a depthwise separable residual convolutional Long Short-Term Memory (LSTM) network was proposed to model and recognize the feature information of dynamic gestures. First, conventional 3D convolution was applied to the input video frames for preliminary feature extraction, with a large convolution kernel size chosen to expand the receptive field. Then, the shallow features were re-extracted by separable convolutional residual operations to model high-dimensional features. Finally, the features extracted in the first two stages were passed through 3D pooling and fed into an LSTM network to model the temporal information of the video data, with an attention mechanism introduced into the input. Experimental results on a large-scale isolated gesture dataset show that the accuracy of the proposed method is 21.02 percentage points higher than that of the original MFSK (Mixed Features around Sparse Keypoints) + BoVW (Bag of Visual Words) + SVM (Support Vector Machine) network.

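Keywords: deep residual network; separable convolution; Long Short-Term Memory (LSTM) network; dynamic gesture recognition; attention mechanism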

Classification: TP37 [Automation and Computer Technology: Computer System Architecture]
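The abstract names separable convolution as the second feature-extraction stage but does not give layer sizes. As a rough illustration of why such a stage is attractive, the sketch below compares the weight count of a standard 3D convolution with that of a depthwise separable one (a per-channel k x k x k filter followed by a 1 x 1 x 1 pointwise convolution). The function names, channel counts, and kernel size are hypothetical and chosen only for illustration. They are not taken from the paper, and the paper's separable layers may be structured differently.

```python
def conv3d_params(c_in, c_out, k):
    """Weights in a standard 3D convolution with a k x k x k kernel (bias omitted)."""
    return k ** 3 * c_in * c_out


def separable_conv3d_params(c_in, c_out, k):
    """Weights in a depthwise k x k x k convolution plus a 1 x 1 x 1 pointwise convolution."""
    depthwise = k ** 3 * c_in   # one k x k x k filter per input channel
    pointwise = c_in * c_out    # 1 x 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes (assumptions, not values from the paper).
c_in, c_out, k = 64, 128, 3
standard = conv3d_params(c_in, c_out, k)
separable = separable_conv3d_params(c_in, c_out, k)
ratio = separable / standard  # algebraically equal to 1/c_out + 1/k**3

print(f"standard: {standard}, separable: {separable}, ratio: {ratio:.4f}")
```

Under these assumed sizes the separable layer needs only a small fraction of the weights of the standard layer, and the ratio depends only on the output channel count and the kernel size.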
