Authors: 童立靖 TONG Lijing, 王清河 WANG Qinghe, 冯金芝 FENG Jinzhi (School of Information, North China University of Technology, Beijing 100144, China)
Source: 《中南民族大学学报(自然科学版)》 Journal of South-Central University for Nationalities: Natural Science Edition, 2024, No. 1, pp. 97-103 (7 pages)
Funding: Beijing Natural Science Foundation (4194076); Beijing Municipal Universities "Young Top Talent" Training Program (CIT&TCD201904009).
Abstract: To address the low accuracy of gaze estimation in unconstrained environments, a gaze estimation method based on a hybrid Transformer model is proposed. First, the MobileNet V3 network is improved by adding a coordinate attention module, enhancing the effectiveness of its feature extraction; the improved MobileNet V3 network is then used to extract gaze-estimation features from face images. Next, the feed-forward network layer of the Transformer model is enhanced by incorporating a depthwise convolution layer with a 3×3 kernel to improve global feature integration. Finally, the extracted features are fed into the improved Transformer model for integration, which outputs the 3D gaze direction. Evaluated on the MPIIFaceGaze dataset, the method achieves a mean gaze angular error of 3.56°, indicating that the model can perform 3D gaze estimation fairly accurately.
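The feed-forward modification described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the shapes, the ReLU activation, and the step of folding the token sequence back into a spatial map before the depthwise convolution are all assumptions for the sake of a runnable example.

```python
import numpy as np

def depthwise_conv3x3(x, kernels):
    """Per-channel 3x3 convolution with zero padding.

    x: (C, H, W) feature map; kernels: (C, 3, 3), one kernel per channel.
    """
    C, H, W = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(padded[c, i:i + 3, j:j + 3] * kernels[c])
    return out

def ffn_with_depthwise(tokens, w1, w2, kernels, hw):
    """Feed-forward block with a 3x3 depthwise conv between the two linear layers.

    tokens: (N, D) sequence of N patch tokens with D channels; hw = (H, W)
    with H * W == N, used to fold the tokens back into a spatial map so the
    depthwise convolution can mix neighboring positions.
    """
    h = np.maximum(tokens @ w1, 0.0)          # first linear layer + ReLU, (N, Dh)
    H, W = hw
    fmap = h.T.reshape(-1, H, W)              # fold tokens to (Dh, H, W)
    fmap = depthwise_conv3x3(fmap, kernels)   # local spatial mixing per channel
    h = fmap.reshape(fmap.shape[0], -1).T     # unfold back to (N, Dh)
    return h @ w2                             # second linear layer, back to (N, D)

rng = np.random.default_rng(0)
N, D, Dh, H, W = 16, 8, 32, 4, 4
out = ffn_with_depthwise(rng.normal(size=(N, D)),
                         rng.normal(size=(D, Dh)),
                         rng.normal(size=(Dh, D)),
                         rng.normal(size=(Dh, 3, 3)), (H, W))
print(out.shape)  # (16, 8): same token count and channel width as the input
```

The depthwise convolution touches each channel independently, so it adds local spatial context to the otherwise position-wise feed-forward layer at low cost; this is the mechanism the abstract credits with improving feature integration.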
Classification: TP391.41 [Automation and Computer Technology / Computer Application Technology]