Authors: GU Ming (顾明); LI Yiqun (李轶群); ZHANG Erchao (张二超); ZHANG Xunlei (张训雷); QI Lin (齐林)[3]; TIE Yun (帖云)
Affiliations: [1] Henan Communications Investment Group Company Limited, Zhengzhou, Henan 450016, China; [2] Zhengzhou Branch, Zhongxun Post & Telecommunication Consulting & Design Institute Company Limited, Zhengzhou, Henan 450000, China; [3] School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China
Source: Journal of Computer Applications (《计算机应用》), 2022, Issue S01, pp. 59-63 (5 pages)
Abstract: In human-computer interaction, most gesture recognition algorithms cannot effectively remove the influence of the capture background on the gesture region to be extracted, and accurately modeling gesture motion information remains difficult. To address these problems, a depthwise separable residual convolutional Long Short-Term Memory (LSTM) network is proposed to model and recognize the features of dynamic gestures. First, conventional 3D convolution performs a preliminary feature extraction on the input video frames, using a relatively large kernel size to enlarge the receptive field. Next, separable convolutional residual operations re-extract the shallow features to model high-dimensional features. Finally, the features from the first two stages pass through 3D pooling and are fed into an LSTM network, which models the temporal information of the video data, with an attention mechanism introduced at the input. Experiments on a large-scale isolated gesture dataset show that the accuracy of the proposed method is 21.02 percentage points higher than that of the original MFSK (Mixed Features around Sparse Keypoints) + BoVW (Bag of Visual Words) + SVM (Support Vector Machine) network.
Keywords: deep residual network; separable convolution; long short-term memory network; dynamic gesture recognition; attention mechanism
Classification: TP37 [Automation and Computer Technology: Computer System Architecture]
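Note: The abstract names depthwise separable convolution as the second feature-extraction stage. As a rough illustration of why such layers are attractive, the sketch below compares weight counts for a standard 3D convolution and a depthwise separable one, using the textbook definition (a per-channel k×k×k filter followed by a 1×1×1 pointwise convolution). The function names and the layer sizes (64 input channels, 128 output channels, kernel size 3) are hypothetical and are not taken from the paper, whose exact layer configuration is not given in this record.

```python
def conv3d_params(c_in, c_out, k):
    """Weights in a standard 3D convolution with a k*k*k kernel (bias omitted)."""
    return k ** 3 * c_in * c_out


def separable_conv3d_params(c_in, c_out, k):
    """Weights in a depthwise separable 3D convolution (bias omitted)."""
    depthwise = k ** 3 * c_in   # one k*k*k filter per input channel
    pointwise = c_in * c_out    # 1*1*1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration (not from the paper).
C_IN, C_OUT, K = 64, 128, 3
standard = conv3d_params(C_IN, C_OUT, K)
separable = separable_conv3d_params(C_IN, C_OUT, K)
print(standard, separable, round(standard / separable, 1))
```

Under these assumed sizes the separable layer needs roughly one twentieth of the weights, which is the usual motivation for stacking such layers inside residual blocks.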