Authors: ZHANG Hui (章惠), ZHANG Nana (张娜娜) [2], HUANG Jun (黄俊)
Affiliations: [1] College of Information Technology, Shanghai Ocean University, Shanghai 201306, China; [2] College of Information Technology, Shanghai Jian Qiao University, Shanghai 201306, China
Source: Journal of Computer Applications (《计算机应用》), 2021, No. 6, pp. 1667-1672 (6 pages)
Funding: Shanghai Municipal Education Commission "Chenguang Program" (晨光计划) funded project (AASH1702).
Abstract: Traditional head pose estimation methods have low accuracy, or fail altogether, when key facial feature points cannot be located because of partial occlusion or an excessively large angle. To address this problem, a multi-angle head pose estimation method based on an optimized LeNet-5 network is proposed. First, the depth of the Convolutional Neural Network (CNN), the convolution kernel size, and other parameters are optimized to better capture the global features of the image. Then, the pooling layers are improved: convolution operations replace the pooling operations to strengthen the nonlinear capacity of the network. Finally, the AdaBound optimizer is introduced, and a Softmax regression model is used for pose classification training. During training, samples with hair occlusion, exaggerated expressions, and glasses are added to the self-built dataset to improve the generalization ability of the network. Experimental results show that the proposed method does not need to locate key facial feature points and can estimate head pose under multi-angle rotations, such as raising, lowering, and tilting the head, even under occlusion by lighting shadows and hair. It reaches an accuracy of 98.7% on the Pointing04 and CAS-PEAL-R1 public datasets, with an average running speed of 22 to 29 frames per second.
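Keywords: head pose estimation; key facial feature points; LeNet-5 network; Convolutional Neural Network (CNN); pose classification
Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]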
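
Illustrative note: the abstract mentions replacing pooling operations with convolution operations and classifying poses with a Softmax model. The sketch below is only a toy illustration of those two ideas in plain Python, not the authors' network. The 2x2 kernel, its weights, the stride of 2, the ReLU activation, and the three example logits are all assumptions chosen for demonstration; the record gives no layer sizes or weights, and the AdaBound optimizer is not shown.

```python
import math


def max_pool2x2(image):
    """Classic 2x2 max pooling with stride 2 (the operation being replaced)."""
    return [
        [max(image[i][j], image[i][j + 1], image[i + 1][j], image[i + 1][j + 1])
         for j in range(0, len(image[0]) - 1, 2)]
        for i in range(0, len(image) - 1, 2)
    ]


def conv2d_stride2(image, kernel, bias=0.0):
    """Stride-2 'valid' convolution followed by ReLU.

    It downsamples like pooling, but with weights that could be learned.
    """
    k = len(kernel)
    out = []
    for i in range(0, len(image) - k + 1, 2):
        row = []
        for j in range(0, len(image[0]) - k + 1, 2):
            s = bias
            for di in range(k):
                for dj in range(k):
                    s += image[i + di][j + dj] * kernel[di][dj]
            row.append(max(0.0, s))  # ReLU nonlinearity (assumed)
        out.append(row)
    return out


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 4x4 "image" holding the values 0..15.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
# Hypothetical kernel weights (an averaging filter), not taken from the paper.
kernel = [[0.25, 0.25], [0.25, 0.25]]

pooled = max_pool2x2(image)              # 2x2 output
conved = conv2d_stride2(image, kernel)   # also a 2x2 output
probs = softmax([2.0, 1.0, 0.1])         # hypothetical scores for 3 pose classes
```

Both operations reduce the 4x4 input to 2x2, which is why a strided convolution can stand in for a pooling layer; the difference is that the convolution's weights are trainable parameters.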