One-shot Learning Classification and Recognition of Gesture Expression From the Egocentric Viewpoint in Intelligent Human-computer Interaction  Cited by: 6

Authors: LU Zhi, QIN Shi-Yin [1,2], LI Lian-Wei, ZHANG Ding-Hao

Affiliations: [1] School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191; [2] School of Electrical Engineering and Intelligentization, Dongguan University of Technology, Dongguan 523808; [3] School of Electronic Information Engineering, Beihang University, Beijing 100191

Source: Acta Automatica Sinica, 2021, No. 6, pp. 1284-1301 (18 pages)

Funding: Supported by the Key Program of the National Natural Science Foundation of China (61731001).

Abstract: In intelligent human-computer interaction (HCI), gestures expressed from the interacting person's own (egocentric) viewpoint play an important role, and recognizing gestures from that viewpoint is the most important technical step. This paper studies a one-shot learning hand gesture recognition (OSLHGR) algorithm for the egocentric viewpoint in complex application scenarios, built as a cascade of deep convolutional neural networks (CNNs). With convenience and applicability in practical use in mind, an improved lightweight SSD (single shot multibox detector) network is used for fast and accurate detection of the gesture target from the egocentric viewpoint. An improved lightweight U-Net is then used as the main tool for efficient and accurate pixel-level segmentation of the gesture target against complex backgrounds. On this basis, a networked algorithm for one-shot learning recognition of gesture actions from the egocentric viewpoint is proposed, using a combined 3D deep neural network. A series of experiments on the Pascal VOC 2012 dataset and on a gesture dataset collected with a SoftKinetic DS325 show that the proposed networked algorithm has significant advantages in gesture detection and segmentation precision, classification accuracy, and real-time performance. It can provide reliable technical support for convenient, high-performance intelligent HCI in complex application environments.
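The one-shot setting named in the abstract can be illustrated with a minimal sketch: each gesture class is represented by a single labelled example, and a new sample takes the label of its most similar example. This is not the authors' algorithm. It assumes that some upstream network (here unspecified) has already turned each detected and segmented gesture clip into a fixed-length feature vector, and it substitutes plain cosine-similarity matching for the paper's combined 3D network. The function names, gesture labels, and feature values are hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_a * norm_b)


def one_shot_classify(query: Sequence[float],
                      support: Dict[str, Sequence[float]]) -> str:
    """Return the label whose single support vector is most similar to query.

    `support` holds exactly one feature vector per gesture class, which is
    what makes this a one-shot classifier.
    """
    if not support:
        raise ValueError("support set must contain at least one class")
    return max(support, key=lambda label: cosine_similarity(query, support[label]))


# Hypothetical feature vectors; in practice they would come from a trained
# feature extractor applied to the segmented gesture sequence.
support = {
    "swipe_left": [1.0, 0.1, 0.0],
    "swipe_right": [0.0, 0.2, 1.0],
}
query = [0.9, 0.2, 0.1]
predicted = one_shot_classify(query, support)
```

With the values above, `predicted` is `"swipe_left"`, because the query vector points almost in the same direction as that class's single example. Adding a new gesture class only requires adding one more entry to `support`; no retraining is involved in this simplified scheme.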

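Keywords: intelligent human-computer interaction; egocentric viewpoint; deep convolutional neural network; object detection and segmentation; one-shot learning hand gesture recognition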

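Classification No.: TP391.41 [Automation and Computer Technology: Computer Application Technology]; TP183 [Automation and Computer Technology: Computer Science and Technology]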

 
