Author: Jingjing Yao (尧京京), School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai
Source: Modeling and Simulation (《建模与仿真》), 2025, No. 3, pp. 426-434 (9 pages)
Abstract: In this paper, we propose a new method for head pose estimation from RGB images, built on the Hopenet network and the Vision Transformer. The architecture consists of three key components: (1) a backbone network, (2) a Vision Transformer, and (3) a prediction head. We also improve the backbone network by adopting multi-scale dilated separable convolutions to strengthen feature extraction. Compared with the way traditional convolutional neural networks and Vision Transformers extract features, our backbone preserves critical information more effectively while reducing image resolution. Ablation studies confirm that the backbone based on multi-scale dilated separable convolutions retains features better than conventional deep convolutional networks and Vision Transformer architectures. We conduct comprehensive experiments and ablation studies on the 300W-LP and AFLW2000 datasets. The results show that the proposed method significantly improves both accuracy and robustness in head pose estimation relative to Hopenet and to some Transformer-encoder-based methods such as HeadPosr.
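The abstract names multi-scale dilated separable convolution but gives no layer details, so the following is only a generic sketch of that technique, not the paper's backbone. It assumes a depthwise convolution applied at several dilation rates, with the branches concatenated and then mixed by a 1x1 (pointwise) convolution. All function names, the zero padding, the dilation rates, and the sharing of one kernel per channel across all rates are illustrative assumptions.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def dilated_depthwise_conv(channel: Matrix, kernel: Matrix, dilation: int) -> Matrix:
    """Convolve ONE channel with ONE k x k kernel whose taps are spaced
    `dilation` pixels apart. Zero padding keeps the output the same size."""
    h, w = len(channel), len(channel[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + (i - half) * dilation
                    xx = x + (j - half) * dilation
                    if 0 <= yy < h and 0 <= xx < w:  # outside the image counts as zero
                        acc += kernel[i][j] * channel[yy][xx]
            out[y][x] = acc
    return out


def pointwise_conv(channels: List[Matrix], weights: List[List[float]]) -> List[Matrix]:
    """1x1 convolution: each output channel is a weighted sum of the input channels."""
    h, w = len(channels[0]), len(channels[0][0])
    return [
        [[sum(wt * ch[y][x] for wt, ch in zip(row, channels)) for x in range(w)]
         for y in range(h)]
        for row in weights
    ]


def multi_scale_dilated_separable(
    channels: List[Matrix],
    kernels: List[Matrix],
    pointwise: List[List[float]],
    dilations: Sequence[int] = (1, 2, 3),
) -> List[Matrix]:
    """Depthwise convolution at every dilation rate, branches concatenated,
    then mixed by a pointwise convolution.

    Assumption for brevity: the same per-channel kernel is reused at every
    dilation rate. `pointwise` needs len(channels) * len(dilations) weights
    per output channel."""
    branches: List[Matrix] = []
    for d in dilations:
        for ch, ker in zip(channels, kernels):
            branches.append(dilated_depthwise_conv(ch, ker, d))
    return pointwise_conv(branches, pointwise)
```

Larger dilation rates widen the receptive field without adding weights, which is the usual motivation for combining several rates before downsampling. In practice this would be written with a deep learning framework's grouped, dilated convolution layers rather than Python loops.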
Keywords: pose estimation; multi-scale dilated separable convolution; Vision Transformer; Transformer encoder
Classification code: TP391.41 [Automation and Computer Technology / Computer Application Technology]