An efficient and high-precision 3D gaze estimation method based on MLP  (Cited by: 2)

Authors: WU Zhi-hao (吴志豪); ZHANG De-jun (张德军); WU Yi-qi (吴亦奇); CHEN Yi-lin (陈壹林)

Affiliations: [1] School of Computer Science, China University of Geosciences (Wuhan), Wuhan 430078, Hubei, China; [2] Hubei Key Laboratory of Intelligent Robot (Wuhan Institute of Technology), Wuhan 430205, Hubei, China

Source: Computer Engineering & Science (《计算机工程与科学》), 2023, No. 11, pp. 1982-1990 (9 pages)

Funding: National Natural Science Foundation of China (61802355); Open Fund of Hubei Key Laboratory of Intelligent Robot (HBIR 202105).

Abstract: With the wide adoption of convolutional neural networks (CNNs) in computer vision and the public release of many 3D gaze datasets, 3D gaze estimation that combines appearance-based methods with deep learning has attracted growing attention. However, because CNNs are structurally complex, such methods still need improvement for applications with strict real-time requirements. Recent studies have shown that multi-layer perceptron (MLP) models, whose structure is simpler, can achieve performance comparable to the current best CNN and Transformer models. Inspired by this, an efficient and high-precision 3D gaze estimation method based on MLP is proposed. MLP models extract features from the two eye images and the face image, and these features are then fused to derive the 3D gaze. Experimental results show that, for the 31 subjects with different appearances in the MPIIFaceGaze and EyeDiap datasets, the proposed method, UM-Net, achieves gaze estimation accuracy comparable to that of CNN-based methods and has a clear advantage in gaze estimation speed. It therefore has good application prospects in fields with strict real-time requirements.
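The pipeline named in the abstract (per-image MLP feature extraction for the two eyes and the face, followed by fusion into a 3D gaze) can be illustrated with a minimal sketch. This is not UM-Net: the abstract does not give its layers, sizes, or fusion rule, so every name and dimension below (`GazeMLP`, `feat_dim`, one dense layer per branch, concatenation as the fusion step, a pitch/yaw output) is an assumption made for illustration, and the weights are random rather than trained. The angle-to-vector conversion follows one convention commonly used with gaze datasets and may differ from the paper's.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Create a dense layer as (weights, biases) with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer, relu=True):
    """Apply a dense layer to vector x, optionally followed by ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out


class GazeMLP:
    """Hypothetical three-branch MLP: left eye, right eye, face -> (pitch, yaw)."""

    def __init__(self, eye_dim, face_dim, feat_dim=16, seed=0):
        rng = random.Random(seed)
        self.left = make_layer(eye_dim, feat_dim, rng)
        self.right = make_layer(eye_dim, feat_dim, rng)
        self.face = make_layer(face_dim, feat_dim, rng)
        self.fuse = make_layer(3 * feat_dim, feat_dim, rng)
        self.head = make_layer(feat_dim, 2, rng)  # outputs pitch and yaw (radians)

    def forward(self, left_eye, right_eye, face):
        # Assumed fusion rule: concatenate the three branch features.
        feats = (dense(left_eye, self.left)
                 + dense(right_eye, self.right)
                 + dense(face, self.face))
        pitch, yaw = dense(dense(feats, self.fuse), self.head, relu=False)
        return pitch, yaw


def angles_to_vector(pitch, yaw):
    """Convert (pitch, yaw) to a unit 3D gaze direction (assumed convention)."""
    return (-math.cos(pitch) * math.sin(yaw),
            -math.sin(pitch),
            -math.cos(pitch) * math.cos(yaw))


# Usage with flattened toy inputs standing in for image pixels.
rng = random.Random(1)
left = [rng.random() for _ in range(8)]
right = [rng.random() for _ in range(8)]
face = [rng.random() for _ in range(12)]

model = GazeMLP(eye_dim=8, face_dim=12)
pitch, yaw = model.forward(left, right, face)
gaze = angles_to_vector(pitch, yaw)
print(gaze)
```

Because every branch is a plain matrix-vector product, the forward pass involves no convolutions, which is consistent with the speed argument the abstract makes for MLP-based models; the sketch itself makes no claim about accuracy.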

Keywords: 3D gaze estimation; appearance-based; multi-layer perceptron (MLP); real-time performance

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
