Research on an Adolescent Gaze Estimation Algorithm Based on CB-ViT


Authors: YAN Qingsong; MAO Jianhua [1]; LIU Zhi [1,2]; LU Xiaofeng [1,2]

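Affiliations: [1] School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China; [2] Wenzhou Institute of Shanghai University, Wenzhou, Zhejiang 325000, China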

Source: Modern Electronics Technique, 2024, No. 15, pp. 146-150 (5 pages)

Funding: Wenzhou Major Science and Technology Innovation Project (ZY2023003).

Abstract: Gaze estimation is widely used in human-computer interaction (HCI), virtual reality, and medical diagnostic assistance. However, existing public datasets mainly cover adults, so gaze estimation algorithms trained on them often perform poorly when applied to adolescents. To address this, the authors collected an adolescent gaze dataset named "Young-Gaze", which contains gaze data from 107 adolescents. They also propose a 2D gaze estimation algorithm based on ViT (Vision Transformer) that introduces a context broadcasting (CB) module and fuses left-eye and right-eye features from different levels, significantly strengthening the network's feature representation. In experiments, the algorithm achieves an error of 5.42 cm on Young-Gaze, outperforming other current 2D gaze estimation algorithms of the same kind. It was also trained and tested on the public 2D gaze estimation datasets GazeCapture and MPIIFaceGaze, where it likewise performed well, indicating that the algorithm is applicable to adults as well as adolescents.
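The abstract reports error in centimetres but does not define the metric. A minimal sketch, assuming the error is the mean Euclidean distance between predicted and ground-truth on-screen gaze points (the usual convention for 2D gaze estimation); the function name and the sample coordinates are hypothetical and are not taken from the paper:

```python
import math


def mean_euclidean_error_cm(predicted, actual):
    """Mean Euclidean distance (cm) between predicted and true 2D gaze points.

    Assumption: each point is an (x, y) pair in centimetres on the screen plane.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    distances = [
        math.hypot(px - ax, py - ay)
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    return sum(distances) / len(distances)


# Hypothetical gaze points for illustration only (not data from Young-Gaze).
predicted = [(3.0, 4.0), (10.0, 2.0)]
actual = [(0.0, 0.0), (10.0, 5.0)]
print(f"{mean_euclidean_error_cm(predicted, actual):.2f} cm")
```

Under this assumed definition, a reported figure such as 5.42 cm would be the average on-screen distance between where the model predicts the subject is looking and where the subject actually looked.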

Keywords: gaze estimation; head pose; CNN; feature fusion; ViT; context broadcasting

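Classification: TN911.73-34 [Electronics and Telecommunications: Communication and Information Systems]; TP391.41 [Electronics and Telecommunications: Information and Communication Engineering]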

 
