Authors: 童立靖 TONG Lijing, 王清河 WANG Qinghe, 冯金芝 FENG Jinzhi (School of Information, North China University of Technology, Beijing 100144, China)
Source: 《中南民族大学学报(自然科学版)》 Journal of South-Central University for Nationalities: Natural Science Edition, 2024, No. 1, pp. 97-103 (7 pages)
Funding: Beijing Natural Science Foundation (4194076); Beijing Municipal Universities "Young Top Talent" Training Program (CIT&TCD201904009).
Abstract: To address the low accuracy of gaze estimation in unconstrained environments, a gaze estimation method based on a hybrid Transformer model is proposed. First, the MobileNet V3 network is improved by adding a coordinate attention module, enhancing the effectiveness of its feature extraction; the improved MobileNet V3 network is then used to extract gaze-estimation features from face images. Next, the feed-forward network layer of the Transformer model is enhanced by incorporating a depthwise convolution layer with a 3×3 kernel to improve global feature integration. Finally, the extracted features are fed into the improved Transformer model for integration, which outputs the 3D gaze direction. Evaluated on the MPIIFaceGaze dataset, the method achieves a mean gaze angular error of 3.56°, indicating that the model can perform 3D gaze estimation fairly accurately.
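The feed-forward modification described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the shapes, the ReLU activation, and the step of folding the token sequence back into a spatial map before the depthwise convolution are all assumptions for the sake of a runnable example.

```python
import numpy as np

def depthwise_conv3x3(x, kernels):
    """Per-channel 3x3 convolution with zero padding.

    x: (C, H, W) feature map; kernels: (C, 3, 3), one kernel per channel.
    """
    C, H, W = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(padded[c, i:i + 3, j:j + 3] * kernels[c])
    return out

def ffn_with_depthwise(tokens, w1, w2, kernels, hw):
    """Feed-forward block with a 3x3 depthwise conv between the two linear layers.

    tokens: (N, D) sequence of N patch tokens with D channels; hw = (H, W)
    with H * W == N, used to fold the tokens back into a spatial map so the
    depthwise convolution can mix neighboring positions.
    """
    h = np.maximum(tokens @ w1, 0.0)          # first linear layer + ReLU, (N, Dh)
    H, W = hw
    fmap = h.T.reshape(-1, H, W)              # fold tokens to (Dh, H, W)
    fmap = depthwise_conv3x3(fmap, kernels)   # local spatial mixing per channel
    h = fmap.reshape(fmap.shape[0], -1).T     # unfold back to (N, Dh)
    return h @ w2                             # second linear layer, back to (N, D)

rng = np.random.default_rng(0)
N, D, Dh, H, W = 16, 8, 32, 4, 4
out = ffn_with_depthwise(rng.normal(size=(N, D)),
                         rng.normal(size=(D, Dh)),
                         rng.normal(size=(Dh, D)),
                         rng.normal(size=(Dh, 3, 3)), (H, W))
print(out.shape)  # (16, 8): same token count and channel width as the input
```

The depthwise convolution touches each channel independently, so it adds local spatial context to the otherwise position-wise feed-forward layer at low cost; this is the mechanism the abstract credits with improving feature integration.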
Classification: TP391.41 [Automation and Computer Technology / Computer Application Technology]